Introduction - German and Dutch
in contrast: synchronic, diachronic and 
psycholinguistic perspectives 


The present volume is a contribution to Contrastive Linguistics (= CL), a branch
of comparative linguistics whose remit is the fine-grained, potentially holistic
comparison of a small number of socioculturally and/or genealogically related
languages with a focus on divergences rather than convergences (Gast 2013). 
Unlike typological comparison, which draws on large samples of diverse languages 
in search of constraints on linguistic diversity (Croft 2003), Contrastive Linguistics 
came into being in the mid-20th century in the context of foreign-language peda- 
gogy. Its earliest supporters (Fries 1945; Lado 1957) started from the “Contrastive 
Analysis Hypothesis” (Wardhaugh 1970), i.e. the belief “that a detailed compara- 
tive and contrastive study of the native (L1) and the second (L2) language might 
reveal exactly which problems learners with the same L1 have in learning the L2” 
(Ringbom 1994: 737). While this assumption soon proved untenable in its original 
form (ibid.: 738-740), a later, more moderate version known as Error Analysis 
(James 1998) was more successful. Treating the learner’s first language as just 
one factor among many in the complex process of language acquisition/learn- 
ing, it continues to play an important role in language pedagogy alongside related 
approaches, not least in contexts such as second-language teaching in multicul- 
tural societies (Leontiy (ed.) 2012). The recent surge in the development of learner 
corpora (Gaeta 2015) has also helped keep the pedagogical implications of CL in 
focus. 

Even as early optimism regarding Contrastive Analysis gave way to disillusion- 
ment and then realism, the practice of contrastive research was taking hold in 
linguistics. Involving a large number of European languages on either side of the 
Iron Curtain, often in combination with English, many of the respective projects 
and conferences yielded impressive results that were quite independent of their 
original pedagogical objectives (Ringbom 1994: 741f.). This process of emancipa- 
tion reached its apex with John Hawkins’ aptly titled monograph A comparative 
typology of English and German: Unifying the contrasts (Hawkins 1986), in which 
the comparison of two genealogically related, yet in some ways markedly differ- 
ent languages was re-cast as an application of linguistic typology. Looking beyond 
individual contrasts between German and English for potential generalisations,
Hawkins suggested that these two languages were located at opposite poles of
“a typological continuum whereby languages vary according to the degree to which
surface forms and semantic representations correspond” (ibid.: 123). According to 
this hypothesis, German grammar is semantically more transparent than English 
grammar in part because German inflectional morphology clarifies the functional 
roles of “noun phrases (NPs)” in the clause (ibid.: 121-127, 215-217; cf. Fischer 2013 
and Hawkins 2018 for recent discussion). Although a more mixed picture is now 
presented in König and Gast’s survey Understanding English-German contrasts
(König/Gast 2018, first published in 2007 and today in its fourth, repeatedly revised 
and expanded edition), Hawkins’s approach was able to highlight two strengths 
of CL: its ability to serve as “small-scale typology” (König 2012: 25) or “pilot typol- 
ogy” (van der Auwera 2012), and its capacity to unify specific contrasts in a broader, 
potentially holistic perspective. This ensures the continuing relevance of CL, not 
only for language pedagogy and linguistic typology, but also for other disciplines 
with an intrinsic interest in contrastive comparison such as translation studies 
(Vandepitte/De Sutter 2013) and psycholinguistics, given the role of crosslinguistic 
evidence in the language-and-cognition debate (cf. below). 

Besides these affiliated fields, a particularly close ally of CL is historical-com- 
parative linguistics. A well-established line of research on the borderline between 
CL and historical-comparative linguistics is the sustained trilingual comparison 
of German and English with Dutch. First conceived by van Haeringen (1956) in his 
book Nederlands tussen Duits en Engels (‘Dutch between German and English’), 
its aim is to profile Dutch through a comparison with German and English, a 
configuration aptly labelled the “Germanic Sandwich” (see inter alia Ruigendijk/ 
van de Velde/Vismans 2012). Van Haeringen’s main observation is that Dutch holds 
the middle between German and English, systematically and for historical reasons, 
in domains of the linguistic system as diverse as the relationship of orthography 
to phonology, the amount of foreign influence on the lexicon, the richness of 
nominal and verbal morphology, the productivity of nominal compounding, and 
the flexibility of word order. The desire to test this hypothesis against new phe- 
nomena or data, and indeed to expand it to new combinations of languages as 
long as Dutch remains in focus, has spawned the now well-known Germanic 
Sandwich conference series which began in Berlin (2005) and then moved on to 
Sheffield (2008), Oldenburg (2010), Leuven (2013), Nottingham (2015), Münster
(2017) and Amsterdam (2019), with Cologne (2021) waiting in the wings. It has 
also produced publications such as the volume commemorating the fiftieth
anniversary of van Haeringen’s original monograph (Hüning et al. (eds.) 2006), several
thematic journal issues (Journal of Germanic Linguistics 22.4, 2010, and 28.4, 
2016; Leuvense Bijdragen/Leuven Contributions in Linguistics and Philology 98, 
2012, and 101, 2017) and indeed the present volume, which brings together papers
that were mostly presented at the 2017 conference in Münster. 

The book is organized in three sections, reflecting different perspectives on 
the contrastive comparison of German, Dutch, English and/or other Germanic 
languages. They include a section of synchronic studies in the tradition of CL, a 
section of diachronic studies in the historical-comparative tradition and, for the 
first time in a Sandwich-related volume, a section on psycholinguistics, a multi- 
disciplinary field which has recently come to focus increasingly on processes of 
acquisition and on the use of experimental data from a contrastive perspective. 


1 Synchronic perspectives 


While tackling topics already addressed by van Haeringen (1956) such as the 
distinction between weak and strong verbs, nominal number morphology, and the 
grammatical gender system, contributions to the Germanic Sandwich meetings 
and collections have been broader in scope, often including linguistic phenomena 
outside the analytic-synthetic dimension as traditionally defined. Citing at random 
examples from the relevant collections, we find discussions of phenomena from 
the expected domains of phonology, morphology and syntax like impersonal pro- 
nouns (Weerman 2006; van der Auwera/Gast/Vanderbiesen 2012), the formation of 
clippings (Leuschner 2006), combinations of modal particles (Braber/McLelland 
2010) and voice onset in the laryngeal system (Simon/Leuschner 2010), but also 
sociolinguistic topics such as lexical borrowing from French (Hunter/Foolen 2012; 
cf. Sapir 1921: 140 on a possible link with the analytic-synthetic dimension) and 
learners’ perceptions of interlinguistic distance (Vismans/Wenzel 2012). While 
some papers refer only to two of the three original languages, the total set of 
languages in focus has become broader than van Haeringen had envisaged and 
now includes languages like Swedish or Afrikaans. Not surprisingly, the extent to 
which Dutch appears to hold an intermediate position between German and Eng- 
lish (or indeed between any other pair of contrasting languages) differs between 
individual papers, and so does the apparent strength of any links between the con- 
trasts observed and more general typological differences between the languages in 
focus. The range of theories and methodologies is markedly broader, too, drawing 
routinely on cognitive frameworks, corpus data and psycholinguistic methods. 
As for the synchronic perspective on contrastive research, the present volume 
opens with two papers revealing classic Sandwich patterns in linguistic domains 
not previously investigated from this perspective. Sebastian Kürschner examines
German, Dutch, and English nickname formation through a contrastive corpus 
of nicknames as found in the online profiles of amateur athletes. As prototypes,
parallels and divergences in the formation and creation of nicknames are high- 
lighted, Dutch turns out to hold an intermediate position between German and 
English in several respects. In the second article of this section, Tanja Mortelmans 
and Elena Smirnova address the English way-construction [SUBJᵢ V POSSᵢ way OBL]
and its reflexive analogues in German and Dutch from a cognitive point of view, 
arguing that the different constructions are best compared using conceptual 
terms describing middle situations in the domain of autocausative motion. Again, 
a Sandwich pattern emerges, with Dutch part-way between the extremes of 
English, where the way-construction has come to predominate at the cost of the 
historically prior reflexive resultative construction, and German, which has no 
schematic Weg-construction at all. Next, Tom Bossuyt compares the distribution 
of English -ever, German immer and/or auch, and Dutch (dan) ook in universal 
concessive-conditional and free relative subordinate clauses (e.g. German was 
immer du auch willst ‘whatever you want’) and in their elliptically reduced versions 
(e.g. Dutch of wat dan ook ‘or whatever’), based on more than 38,000 example 
sentences from a combination of large language-specific corpora with the smaller 
multilingual ConverGENTiecorpus. Although a sandwich-like pattern emerges in 
this case, too, it has German between Dutch and English rather than Dutch 
between German and English. In the closing paper of the synchronic section, 
Peter Dirix, Liesbeth Augustinus and Frank Van Eynde investigate the “infinitivus 
pro participio” (IPP) effect, a type of construction in which some verbs select an 
infinitive instead of a past participle to form the perfect in Dutch, German and 
Afrikaans. Using corpus data to identify the verbs which (obligatorily or optionally) 
show the IPP effect in Afrikaans, they compare the verb classes showing the IPP 
effect in Afrikaans with those in Dutch and German, pinpointing crosslinguistic 
similarities and differences without any clear Sandwich pattern emerging. 


2 Diachronic perspectives 


A landmark in the contrastive study of Dutch, van Haeringen’s (1956) book was 
not written primarily with pedagogical applications in mind, nor did van Haerin- 
gen engage directly in historical research. Instead, he set out to broadly compare 
the structures of Dutch, German and English and thereby seek insights into dia- 
chronic divergences leading to synchronic contrasts. His key diachronic concept 
in explaining the divergences is analytische verbrokkeling (‘analytic crumbling’), 
i.e. the process by which the West Germanic languages shifted from the synthetic 
to the analytic type. This process, he shows, has progressed further in English than 
in Dutch and further in Dutch than in German, which still displays significant
similarities to the West Germanic ancestor language (cf. also König 2012 for a 
broader Germanic view). 

The holistic nature of van Haeringen’s account and its explanatory aspirations 
are reminiscent of typological work by linguists like Sapir (1921). Seeking to identify 
more general, abstract structures in languages so as to develop more powerful 
hypotheses on the causes of language change, Sapir identifies three parallel “drifts 
of major importance” in Indo-European languages (ibid.: 134), viz. the reduction of 
the case system, the tendency towards fixed word order and, finally, the “drift 
toward the invariable word” which Sapir regards as the dominant development of 
the three (ibid.: 139). Although van Haeringen (1956) does not mention Sapir by 
name, the similarities are striking, as indeed are the affinities with Hawkins (1986), 
who interprets the apparent lack of semantic transparency in English grammar 
as the synchronic consequence of a diachronic realignment of form-meaning 
mappings resulting from case syncretism (ibid.: 123, citing Sapir 1921), i.e. again 
from the drift towards the invariable word. At the same time, van Haeringen’s 
close comparison of Dutch, German and English challenged any too sweeping 
categorisations in holistic typology. First, Dutch resists a straightforward syn- 
chronic classification as either synthetic or analytic; in fact, it does so to such an 
extent that van Haeringen (1956: 36) labels it “artistically unsystematic” (artistiek 
onsystematisch). Second, although van Haeringen (ibid.: 22-23) adopts the tradi- 
tional view that the reduction of final syllables as observed in ‘analytic crumbling’ 
is diachronically linked to the fixation of Germanic word accent on the first syllable, 
he also points out that the typological status of Dutch casts doubt on any straight- 
forward causal, indeed mechanical relationship between, on the one hand, the 
fixation of word accent or the resulting reduction of morphological richness, and 
compensatory developments in the realm of syntax on the other hand (ibid.). He 
therefore leaves open the possibility of a reverse causal relationship, with greater 
restrictions on word order potentially creating room for morphology to become 
redundant (ibid.; see Hüning 2006 for a more detailed analysis of van Haeringen’s
account and its place in the history of linguistics). From the perspective of modern 
historical linguistics, compensatory developments involved in ‘analytic crumbling’ 
invite an explanation in terms of grammaticalisation, a process which in many 
cases led to the replacement of cognate synthetic structures with language-specific 
analytic ones in West Germanic. Examples are the rise of auxiliaries fulfilling func- 
tions associated with verbal morphology (e.g., Landsbergen 2006; Poortvliet 2016) 
and of prepositions replacing case endings (e.g., van der Wouden 2006). 

Apart from identifying and comparing structures based on functional equiva- 
lence, some research has tried to link diachronic variation to aspects of linguistic 
cognition, including factors like processing efficiency and linguistic complexity 
(Hawkins 2004). Deeper functional or cognitive explanations of cross-linguistic
variation and change figure increasingly in computational simulations of language 
change, such as Van Trijp’s (2013) study of the effects of cue reliability, processing 
efficiency and ease of articulation on syncretism in the German definite article, 
and Pijpops/Beuls/Van de Velde’s (2015) study of the rise of the weak preterite in
Germanic. Some factors are rooted in the social environment in which language is 
used. For instance, referring to work by Thomason/Kaufman (1988) on English and 
Boyce Hendriks (1998) on Dutch, Weerman (2006) hypothesizes that deflection in 
West Germanic languages intensified in periods of language contact, when there 
were more L2 learners. 

The three explicitly diachronic articles in the present collection illustrate the 
most recent developments in the field. Mirjam Schmuck’s comparison of the use of 
the definite article in German, Dutch and English shows that the German article’s 
functional domain has been expanding into generic usages and combinations 
with proper nouns, suggesting a more advanced grammaticalisation process 
than in Dutch and English. While confirming the position of Dutch between 
German and English, Schmuck’s account stands out because in this case it is Ger- 
man grammar that allows the more progressive options within West Germanic, 
casting doubt on any straightforward characterisations of German as a conser- 
vative language. The article by Joachim Kokkelmans uses the diachronic compar- 
ative perspective to relate s-retraction in /rs/ clusters, a well-known phonological 
development in Middle High German, to a broader typological feature of the lan- 
guage. By extending his scope to include non-standard varieties of German, Dutch 
and English, and indeed data from beyond (West) Germanic, Kokkelmans links 
s-retraction to the general development of sibilant inventories, which are more 
conservative in Dutch and Low German than in varieties having previously
phonemicised /ʃ/ as a second sibilant. Finally, Jessica Nowak’s article on the
sentence-internal capitalisation of nouns shows how the diffusion of innovations across
German and Dutch, although driven by linguistic factors (i.e. initially emphatic 
and/or honorific use, then animacy and concreteness of the referent), is linked to 
cultural contact and standardisation processes. 


3 Psycholinguistic perspectives 


Whereas the synchronic and diachronic papers in this volume are concerned 
with the analysis and explanation of contrasts and changes in surface structure, 
the psycholinguistic papers employ CL in the explanation of human behavior 
(Gardner 1985; Tervoort et al. 1987). Psycholinguistics, a multidisciplinary field, 
came into being in the 1950s with the rise of cognitive science, which aims to
“characterize human knowledge - its forms and content - and how that knowledge
is processed, acquired, used and developed” (Gardner 1985). Human language
can be regarded as a cognitive system (Sloan Foundation 1978) that is either
treated as universal and relatively autonomous (Chomsky 1980; Pinker 1994) or 
as closely interrelated with and mutually affected by other processes like cogni- 
tion, consciousness, experience, embodiment, brain, self, and human interaction 
(Tomasello 2003; Robinson/Ellis 2008). 

After an early surge of empirical studies on language and color perception in 
the 1950s and 1960s (see Gentner/Goldin-Meadow 2003; Everett 2013; Athanaso- 
poulos/Bylund/Casasanto 2016 for overviews), issues of language-and-cognition 
have again become an area of active investigation over the past few decades. 
Semantic analyses carried out in the 1970s by Talmy (1975), Langacker (1976), 
Bowerman (1980) and others brought to light major differences in the way lan- 
guages carve up the world, not only in the domain of color terms but also, for 
example, through spatial prepositions (Gumperz/Levinson (eds.) 1996) and gram- 
matical aspect (Comrie 1976). Follow-up studies based on acquisition data or psy- 
cholinguistic experiments showed that some of this typological diversity carries 
over to sets of related languages (see e.g., Garnham et al. 2016 on gendered articles 
and nouns in European languages; Coventry et al. 2018 on spatial prepositions), 
including pairs of Germanic ones (e.g., Athanasopoulos/Bylund 2013 on aspect 
in Swedish and English; and Mills 1986 on grammatical gender in German and 
English). This diversity was taken by some to imply a refutation of the universalist 
view of language and conceptual structure, and by others as an indication that 
semantic and conceptual structure operate independently of one another (see 
above). This debate is still unresolved today. While empirical data provide little 
support for universalist views of language and conceptual structure (Dabrowska 
2015; Ibbotson/Tomasello 2016), some authors continue to argue in favor of uni- 
versalist stances (Everaert et al. 2015; Boxell 2016). 

Bilinguals, a term used here to refer to any individuals employing multiple 
languages, started to receive attention as a favorable testing case for effects of 
language on cognition during the 1960s and 1970s. After 1980, bilingualism was 
consolidated as a field of research (see e.g., Baker 1993; Grosjean 1982), and the 
subsequent rise of new empirical methods such as eye-tracking, EEG, and fMRI 
resulted in several volumes also addressing non-linguistic behavior in bilinguals 
(Kroll/De Groot 2005; Pavlenko 2014). In addition to studies comparing L1 and 
L2 production, empirical studies with behavioral measures (memory accuracy, 
speed of reaction, eye movement) have documented cognitive effects associated 
with bilingualism in certain conceptual domains (e.g., Koster/Cadierno 2018 on 
recognition memory for object position in German/Spanish placement events). 
In line with the topic of the present volume, all contributions in the
psycholinguistic section focus at least on German and Dutch, and some on additional
languages as well. Leah Bauke examines whether L1 verb-second word order 
affects how German, Dutch and Norwegian learners respond to a grammaticality 
judgment task in L2 English. Her data reveal a representational conflict in terms of 
competing grammars, with Norwegian learners of English behaving differently
from Dutch and German learners. Gunther De Vogelaer, Johanna Fanta, Greg 
Poarch, Sarah Schimke and Lukas Urbanek examine regional similarities and 
differences in the production and perception of Dutch pronominal gender by 
both Dutch and German speakers. Besides pointing out intra- and cross-linguistic 
differences, their data shows that increased uncertainty with respect to grammati- 
cal gender is leading to a resemanticization of Dutch pronominal gender. Paz 
Gonzalez and Tim Diaubalick examine representations of tense in German and 
Dutch learners of L2 Spanish. They argue that the different options of expressing 
aspect in L1 German or Dutch may have profound effects on L2 tense production. 
Finally, Dietha Koster and Hanneke Loerts provide an up-to-date review of empir- 
ical studies on the perception of gender language in L1 and L2 German and Dutch 
speakers. They identify gaps in psycholinguistic research on the topic and define 
three fields of future inquiry to move the study of language, bilingualism and 
cognition forward. 

Like the earlier parts of the volume, the psycholinguistic section testifies to 
the diversity of present-day contrastive research, addressing questions relating 
to the description and explanation of cross-linguistic differences, the understand- 
ing of patterns found in various L2s, or the language-and-cognition debate. Inter- 
estingly, some contributions address phenomena that were earlier investigated in 
synchronic and/or diachronic research, illustrating the potential of an ever closer 
integration of the three perspectives in the future. The strong cognitive orientation 
of present-day linguistics has increasingly brought psycholinguistic explanations 
for synchronic and diachronic variation into the limelight, and will continue to 
do so. At the same time, future interaction can help bring psycholinguistics “out 
of the lab” (cf. Speed/Wnuk/Majid 2017), with the rich empirical tradition in both 
synchronic and diachronic contrastive research on German, Dutch, English, and 
(West-)Germanic at large lending psycholinguistic theorizing a greater “ecological 
validity”. 


Introduction — 9 


References 


Athanasopoulos, Panos/Bylund, Emmanuel (2013): Does grammatical aspect affect motion event 
cognition? A cross-linguistic comparison of English and Swedish speakers. In: Cognitive 
Science 37, 2. 286-309. 

Athanasopoulos, Panos/Bylund, Emmanuel/Casasanto, Daniel (2016): Introduction to the special 
issue: New and interdisciplinary approaches to linguistic relativity. In: Language Learning 66, 
3. 482-486. 

Baker, Colin (1993): Foundations of bilingual education and bilingualism. (= Multilingual 
Matters 95). Clevedon: Multilingual Matters. 

Bowerman, Melissa (1980): The structure and origin of semantic categories in the language 
learning child. In: Foster, Mary/Brandes, Stanley (eds.): Symbol as sense: New approaches 
to the analysis of meaning. New York: Academic Press. 277-299. 

Boxell, Oliver (2016): The place of Universal Grammar in the study of language and mind:
A response to Dabrowska (2015). In: Open Linguistics 2. 352-372.

Boyce Hendriks, Jennifer (1998): Immigration and linguistic change. A socio-historical linguistic 
study of the effect of German and southern Dutch immigration on the development of the 
northern Dutch vernacular in 16th/17th century Holland. Diss. Madison: University of 
Wisconsin-Madison. 

Braber, Natalie/McLelland, Nicola (2010): Combining modal particles in German and Dutch.
In: Journal of Germanic Linguistics 22. 461-482.

Chomsky, Noam (1980): Rules and representations. (= Woodbridge Lectures 11). New York: 
Columbia University Press. 

Comrie, Bernard (1976): Aspect. An introduction to the study of verbal aspect and related problems. 
Cambridge: Cambridge University Press. 

Coventry, Kenny/Andonova, Elena/Tenbrink, Thora/Gudde, Harmen/Engelhardt, Paul (2018): 
Cued by what we see and hear: Spatial reference frame use in language. In: Frontiers in 
Psychology 9. 1-14. 

Croft, William (2003): Typology and universals. 2nd edition. Cambridge: Cambridge University 
Press. 

Dabrowska, Ewa (2015): What exactly is Universal Grammar, and has anyone seen it? In: Frontiers 
in Psychology 6. 1-17. 

Everaert, Martin/Huybregts, Marinus/Chomsky, Noam/Berwick, Robert/Bolhuis, Johan (2015): 
Structures, not strings: Linguistics as part of the cognitive sciences. In: Trends in Cognitive 
Sciences 19. 729-743. 

Everett, Caleb (2013): Linguistic relativity: Evidence across languages and cognitive domains. 
(= Applications of Cognitive Linguistics 25). Berlin: Mouton de Gruyter. 

Fischer, Klaus (2013): Satzstrukturen im Deutschen und Englischen. Typologie und Textreal- 
isierung. (= Konvergenz und Divergenz 1). Berlin: Akademie Verlag. 

Fries, Charles Carpenter (1945): Teaching and learning English as a foreign language.
(= Publications of the English Language Institute, University of Michigan 1). Ann Arbor:
University of Michigan Press.

Gaeta, Livio (2015): Kontrastive Linguistik nach der typologischen Wende. In: Germanistische 
Mitteilungen 40, 1. 79-82. 

Gardner, Howard (1985): The mind’s new science. A history of the cognitive revolution. New York: 
Basic Books. 


10 —— Gunther De Vogelaer/Dietha Koster/Torsten Leuschner 


Garnham, Alan/Oakhill, Jane/Von Stockhausen, Lisa/Sczesny, Sabine (2016): Editorial: Language,
cognition, and gender. In: Frontiers in Psychology 7. 1-3. 

Gast, Volker (2011): Contrastive linguistics: Theories and methods. In: Kabatek, Johannes/ 
Kortmann, Bernd (eds.): Theorien und Methoden der Sprachwissenschaft/Theories and 
methods in linguistics. (= Wörterbücher zur Sprach- und Kommunikationswissenschaft/ 
Dictionaries of Linguistic and Communication Science 11). Berlin: De Gruyter. 

Gentner, Dedre/Goldin-Meadow, Susan (2003): Whither Whorf. In: Gentner, Dedre/ 
Goldin-Meadow, Susan (eds.): Language in mind: Advances in the study of language and 
thought. Cambridge, MA: MIT Press. 3-14. 

Grosjean, François (1982): Life with two languages: An introduction to bilingualism. Cambridge,
MA: Harvard University Press. 

Gumperz, John J./Levinson, Stephen (eds.) (1996): Rethinking linguistic relativity. (= Studies in 
the Social and Cultural Foundations of Language 17). Cambridge: Cambridge University Press. 

Hawkins, John A. (1986): A comparative typology of English and German: Unifying the contrasts. 
London: Croom Helm. 

Hawkins, John A. (2004): Efficiency and complexity in grammars. Oxford: Oxford University Press. 

Hawkins, John A. (2018): Word-external properties in a typology of Modern English: A comparison 
with German. In: English Language and Linguistics. 1-27. https://doi.org/10.1017/S1360674318000060
(last accessed: 22-3-2019)

Hüning, Matthias (2006): Inleiding. Nederlands, Duits, Engels: Tussen-dimensies. In: Hüning/ 
Verhagen/Vogl/van der Wouden (eds.). 9-18. 

Hüning, Matthias/Verhagen, Arie/Vogl, Ulrike/van der Wouden, Ton (eds.) (2006): Nederlands
tussen Duits en Engels. Handelingen van de workshop op 30 september en 1 oktober 2005 
aan de Freie Universität Berlin. Leiden: Stichting Neerlandistiek Leiden. 

Hunter, David/Foolen, Ad (2012): French filling for a Germanic sandwich. A comparative study of 
the French influence on English, Dutch and German vocabulary. In: Leuven Contributions in 
Linguistics and Philology 98. 162-176. 

Ibbotson, Paul/Tomasello, Michael (2016): Evidence rebuts Chomsky’s theory of language 
learning. In: Scientific American 315, 5.
www.scientificamerican.com/article/evidence-rebuts-chomsky-s-theory-of-language-learning/?redirect=1
(last accessed: 1-7-2019).

James, Carl (1998): Errors in language learning and use: Exploring error analysis. London: 
Longman. 

Journal of Germanic Linguistics 22, 4 (2010): Special issue on comparative linguistics: Dutch 
between English and German. 

Journal of Germanic Linguistics 28, 4 (2016): Special issue: New directions in comparative 
Germanic linguistics. 

König, Ekkehard (2012): Contrastive linguistics and language comparison. In: Languages in 
Contrast 12. 3-26. 

König, Ekkehard/Gast, Volker (2018): Understanding English-German contrasts. 4th, newly 
revised edition. Berlin: Schmidt. 

Koster, Dietha/Cadierno, Teresa (2018): The effect of language on recognition memory in first 
language and second language speakers: The case of placement events. In: International 
Journal of Bilingualism 23, 2. 651-669. 

Kroll, Judith/De Groot, Annette (2005): Handbook of bilingualism. Psycholinguistic approaches. 
Oxford: Oxford University Press. 

Lado, Robert (1957): Linguistics across cultures: Applied linguistics for language teachers. Ann 
Arbor: University of Michigan Press. 




Landsbergen, Frank (2006): Krijgen, kriegen en get: Een vergelijkend onderzoek naar
betekenisverandering en grammaticalisatie. In: Hüning/Verhagen/Vogl/van der Wouden (eds.).
259-272. 

Langacker, Ronald (1976): Semantic representations and the linguistic relativity hypothesis. In: 
Foundations of Language 14, 3. 307-357. 

Leontiy, Halyna (ed.) (2013): Multikulturelles Deutschland im Sprachvergleich. Das Deutsch im 
Fokus der meist verbreiteten Migrantensprachen. Ein Handbuch für DaF-Lehrende und
Studierende, für Pädagogen und ErzieherInnen. (= TransLIT 1). Münster: LIT. 

Leuschner, Torsten (2006): Nederlands tussen Duits en... Zweeds. Grafonemische afkortingen 
(Kurzwörter) in taalvergelijkend perspectief. In: Hüning/Verhagen/Vogl/van der Wouden
(eds.). 141-162. 

Leuvense Bijdragen (Leuven Contributions in Linguistics and Philology) 98 (2012). 

Leuvense Bijdragen (Leuven Contributions in Linguistics and Philology) 101 (2017). 

Levinson, Stephen (1996): Introduction to Part II. In: Gumperz/Levinson (eds.). 133-144.

Mills, Anne E. (1986): The acquisition of gender: A study of German and English. Berlin: Springer. 

Pavlenko, Aneta (2014): The bilingual mind: And what it tells us about language and thought. 
Cambridge: Cambridge University Press. 

Pijpops, Dirk/Beuls, Katrien/Van de Velde, Freek (2015): The rise of the verbal weak inflection in 
Germanic: An agent-based model. In: Computational Linguistics in the Netherlands Journal 5. 
81-102. 

Pinker, Steven (1994): The language instinct. New York: Morrow. 

Poortvliet, Marjolein (2016): Copy raising in English, German, and Dutch: Synchrony and 
diachrony. In: Journal of Germanic Linguistics 28. 370-402. 

Ringbom, Håkan (1994): Contrastive Analysis. In: Asher, Ronald E./Simpson, Jacqueline M.Y.
(eds.): The encyclopedia of language and linguistics. Vol. 2. Oxford et al.: Pergamon. 
737-742. 

Robinson, Peter/Ellis, Nick (2008): Handbook of cognitive linguistics and second language 
acquisition. New York: Routledge. 

Ruigendijk, Esther/Van de Velde, Freek/Vismans, Roel (2012): Germanic sandwich 2010: Dutch 
between English and German. In: Leuvense Bijdragen (Leuven Contributions in Linguistics 
and Philology) 98, 1. 1-3. 

Sapir, Edward (1921): Language. An introduction to the study of speech. New York: Harcourt Brace. 

Simon, Ellen/Leuschner, Torsten (2010): Laryngeal systems in Dutch, English and German: A 
Contrastive-Phonological Study on Second and Third Language Acquisition. In: Journal of 
Germanic Linguistics 22. 403-424. 

Sloan Foundation (1978): Cognitive science 1978. Report of The State of the Art Committee to 
The Advisors of The Alfred P. Sloan Foundation. http://csjarchive.cogsci.rpi.edu/misc/ 
CognitiveScience1978_OCR.pdf (last accessed: 1-7-2019). 

Speed, Laura/Wnuk, Ewelina/Majid, Asifa (2017): Studying psycholinguistics out of the lab. In: 
De Groot, Annette/Hagoort, Peter (eds.): Research methods in psycholinguistics and the 
neurobiology of language: A practical guide. New York: Wiley. 190-207. 

Talmy, Leonard (1975): Semantics and syntax of motion. In: Kimball, John (ed.): Syntax and 
semantics. Vol. 4. New York: Academic Press. 181-238. 

Tervoort, Bernard/Prins, Ron/Van Ierland, Margreet/Appel, René (1987): Inleiding tot de
psycholinguistiek. Muiderberg: Coutinho. 

Tomasello, Michael (2003): Constructing a language. A usage-based theory of language 
acquisition. Cambridge, MA: Harvard University Press. 



Thomason, Sarah Grey/Kaufman, Terrence (1988): Language contact, creolization, and genetic 
linguistics. Berkeley: University of California Press. 

Vandepitte, Sonia/De Sutter, Gert (2013): Contrastive linguistics and translation studies. In: 
Gambier, Yves/van Doorslaer, Luc (eds.): Handbook of translation studies. Vol. 4.
Amsterdam/Philadelphia: Benjamins. 36-41. 

van der Auwera, Johan (2012): From contrastive linguistics to linguistic typology. In: Languages 
in Contrast 12. 69-86. 

van der Auwera, Johan/Gast, Volker/Vanderbiesen, Jeroen (2012): Human impersonal pronoun 
uses in English, Dutch and German. In: Leuvense Bijdragen (Leuven Contributions in 
Linguistics and Philology) 98, 1. 27-64. 

van der Wouden, Ton (2006): Nederlandse voorzetsels tussen Duitse en Engelse. In: Hüning/ 
Verhaegen/Vogl/van der Wouden (eds.). 183-206. 

van Haeringen, Coenraad Bernardus (1956): Nederlands tussen Duits en Engels. The Hague: 
Servire. 

van Trijp, Remi (2013): Linguistic selection criteria for explaining language change: A case study 
on syncretism in German definite articles. In: Language Dynamics and Change 3, 1. 105-132. 

Vismans, Roel/Wenzel, Veronika (2012): Dutch between English and German: Language 
learners’ perceptions of linguistic distance. In: Leuvense Bijdragen (Leuven Contributions 
in Linguistics and Philology) 98, 1. 4-26. 

Wardhaugh, Ronald (1970): The Contrastive Analysis Hypothesis. In: TESOL Quarterly 4, 2. 
123-130. 

Weerman, Fred (2006): It’s the economy, stupid! Een vergelijkende blik op men en man. In: 
Hüning/Verhaegen/Vogl/van der Wouden (eds.). 19-46. 


Part 1: Synchronic Perspectives 


Sebastian Kürschner (Eichstätt) 

Nickname formation in West Germanic: 
German Jessi and Thomson meet Dutch Jess 
and Tommie and English J-Bo and Tommo 


Abstract: German, Dutch, and English nickname formation is examined using a 
contrastive corpus of nicknames which were found in the online profiles of ama- 
teur athletes and are compared with the same individuals’ first and last names. 
We study the word formation and word creation of nicknames, either based on 
the athletes’ legal names or coined freely, pointing out parallels and divergences 
between the three languages. Two prototypes are identified crosslinguistically as 
relevant bases for output schemas: disyllabic trochees ending in -i (cf. German 
Conni, Dutch Passie, English Thanny) and monosyllabics ending in a closed syllable
containing a single sibilant (Sash, Bous, Maze). These structures are then interpreted
in terms of preferred sound patterns and sex marking. Dutch turns out in
many respects to hold an intermediate position between German and English. 


Zusammenfassung: Anhand kontrastiver Daten zum Deutschen, Niederländi- 
schen und Englischen wird der Bildung von Spitznamen nachgegangen. Grund- 
lage des Korpus sind Spitznamen von Amateursportlerinnen und -sportlern, die 
internetbasiert anhand von Steckbriefen erhoben wurden und mit den Ruf- und 
Familiennamen der betreffenden Personen abgeglichen werden. Anhand der Wort- 
bildungen und -schöpfungen auf Basis der offiziellen Namen sowie der freien 
Schöpfungen werden Parallelen und Divergenzen von Spitznamen in den drei Spra- 
chen herausgearbeitet. Zwei Prototypen werden sprachübergreifend als Grundlage 
von Output-Schemata identifiziert: zweisilbige, trochäische Namen auf -i (vgl. dt. 
Conni, nl. Passie, engl. Thanny) sowie Einsilber auf geschlossene Silbe mit einfa- 
chem Sibilanten (Sash, Bous, Maze). Die Daten werden in Hinblick auf Lautstruk- 
turpräferenzen und Geschlechterkennzeichnung interpretiert. Das Niederländische 
nimmt dabei in vielerlei Hinsicht eine mittlere Stellung zwischen Deutsch und Eng- 
lisch ein. 


Open Access. © 2020 Kürschner, published by De Gruyter. This work is licensed under
the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110668476-002
1 Introduction


In addition to a legal name, people usually bear a couple of unofficial names, some
of which may be characterized as nicknames.¹ In this chapter, we examine personal
nicknames based on first names such as German Jessi < Jessica or Thomson
< Thomas, and nicknames based on last names such as Dutch Hoegie < Hoegarts, 
Siem < Simons. We also consider freely coined nicknames, cf. English Ders, Loofa. 

The objective of this chapter is to identify parallels and divergences in the 
formation of nicknames in three closely related West Germanic languages, viz. 
German (G.), Dutch (D.), and English (E.), based on a comparable set of data. The 
data stems from amateur athletes’ internet profiles and was gathered analogously 
for the three languages. We compare the data in terms of both the range of variation
and the frequency distribution of the patterns observed in nickname formation.

As the examples in the title show, different kinds of nicknames are formed 
based on first names. G. Jessi and Thomson, D. Jess and Tommie, and E. J-Bo and 
Tommo are all based on Jessica (in the case of J-Bo also integrating the beginning of 
the last name Bowden) and Thomas or Tom, respectively. There is thus variation 
in the formation of nicknames between these languages. However, most of these 
forms could just as likely stem from the other two languages, thereby indicating 
parallels between them as well. 

Our knowledge of nicknames differs between the three languages: While 
monograph studies and other publications exist regarding G. (cf. Kany 1992; 
Naumann 1976, 1977) and E. (cf. Morgan/O’Neill/Harré 1979; Busse 1983; de Klerk/ 
Bosch 1996, 1997; Starks/Leech/Willoughby 2012), our main insights into D. stem 
from studies on specific dialects (cf. Leys 1968; Mennen 1994; Van Langendonck 
1978), while no systematic studies on Standard Dutch have been found. The present 
study seeks to tackle this deficiency. At the same time, contrastive studies of nick- 
names are rare, and we therefore wish to provide new information on nicknames 
in the three languages and their current relation from a contrastive perspective. 
We focus entirely on phonological and morphological aspects, since these have 
proven particularly relevant in earlier studies (cf. Naumann 1976, 1977 for G.; 
Taylor-Leech/Starks/Willoughby 2015 for E.). 


1 Thanks are due to two anonymous reviewers, to Pia Fischl, Erik Lutz and Patricia Rawinsky 
for their valuable help in data collection and preparation, and to Paul Gahman and Torsten 
Leuschner for comments and proofreading. 
2 Nicknames: definitions and characteristics


Nicknames are usually defined as being a) bound to an individual in addition to her 
or his legal name, b) specific to a certain group of people with which an individual 
regularly interacts (e.g., a school class, a sports team, a choir, etc.), c) not suitable
for legal or outsider use, and d) usually chosen by other people, i.e., not self-given 
(cf. Nübling/Fahlbusch/Heuser 2015: 171-172). Apart from these main characteris- 
tics, no consensus has been reached for the definition of nicknames (cf. Brylla’s 
2016 handbook chapter discussing the lack of a common terminology in Germanic
linguistics and even within the linguistics of specific Germanic languages). In 
certain definitions, for instance, the use of the term nickname is restricted to 
bynames, which usually stem from the lexicon and are based on a relation to the 
name bearer’s person, physique, lifestyle, etc. (e.g., Smiley, Angry), while formations
based on the individual’s legal name are regarded as so-called pet names.²
Other definitions offer different categories. Lawson (1973) separates so-called short 
names (such as Dave < David) from what he calls nicknames with an affective 
suffix (such as Davey), and corroborates this separation by asserting that different 
stereotypes of the two types of names exist: While most short names are associated 
with positive values (‘good’, ‘active’, ‘strong’) even more than the corresponding 
full forms, the derived nicknames are rated comparatively low according to these 
values.³

While problems in the definition of nicknames will remain, we use a broad 
definition of the term, following a recent G. textbook on onomastics (Nübling/Fahl- 
busch/Heuser 2015: 172). We define a nickname on emic grounds as what is defined 
as such by G., D., and E. language users. As will be shown below, the data reveals 
that a very comparable and broad definition of nicknames is accepted among 
language users, including bynames, modifications of existing (legal) names, and 


2 As shown below, this definition does not match what linguistic laymen consider nicknames. 
This is corroborated by other students of nicknames such as Starks/Leech/Willoughby (2012: 140), 
who suggest “that researchers who ignore variants of names as nickname types fail to consider 
the views of large numbers of individuals who see variants of names as nicknames”. In fact, in 
many of the existing studies on G. and E. approximately 50-60% of the nicknames collected in 
sample and survey studies are based on the individuals’ legal names. This tendency is in line 
with our data, see section 4. 

3 Note that a repetition of the study might result in different outcomes given the changes in 
society over the past 44 years. Other distinctions, like that of Van Buren (1977), further contribute 
to the terminological confusion: in his terms, forms like Dave are nicknames and forms like 
Davey are affectionate nicknames. Cf. the discussion in Wierzbicka (1992: 225-237), who casts
doubt on the appropriateness of such classifications.
newly coined names without an overt basis in existing words or names. With our
emic definition, we provide a widely comparable set of nicknames that are based 
on a commonly accepted (and mostly parallel) concept of nicknames in the socie- 
ties from which the data stems. 

Note that the term nickname (or G. Nick for short) also appears in names spe- 
cific to internet uses in forums, chats, games, etc. (cf. Gkoutzourelas 2015; Kaziaba 
2016). These names are different from nicknames according to our definition, mainly 
because they are specifically chosen by the individuals themselves for use online. 


3 Data and methodology 


To provide comparable sets of data, samples of nicknames were collected from 
websites in an analogous fashion for G., D., and E. We found that clubs connect- 
ing a group of people such as sports clubs, choirs, youth associations etc. often 
offer lists of their members’ personal profiles. Such personal profiles provide a 
common source of nicknames, particularly in sports teams since nicknaming in 
team sports is “one way of fostering team spirit” (Chevalier 2004: 128). Nicknames 
thus serve a special integrating function within teams and are often uniquely used 
by the team members. In-group interaction through nicknames was similarly 
reflected in online communication via internet profiles. Since this observation 
held for all three language communities examined, we chose to collect nicknames 
from athletes’ online profiles. 

Personal profiles usually consist of systematic information collected in a 
team-internal survey. The athletes are asked to provide personal information 
about specific categories usually including first and last name, age, occupation, 
position played, and other personal information like hobbies. For this study, only 
personal profiles that had “nickname” as a category were considered. Such profiles 
and the corresponding teams were identified via online queries containing the term 
nickname in the respective language (G. Spitzname, D. bijnaam) in combination 
with search terms like team, soccer, basketball etc. Per profile, information on the
nickname, first name, last name, sex, and location was extracted into a database.

To exclude nicknames from children,⁴ we ignored nicknames from children’s teams
and used nicknames from young adult (starting from approximately age 16) and adult
teams only. Professional teams were also left out of consideration because they may
employ nicknames that were not coined within the team but in the media. Additionally,
only nicknames that deviated from the official first and last name were included in
the dataset. We included only one token per nickname type, unless the nickname type
referred to differing legal names; thus Em is listed twice as the short form of either
Emma or Emily, whereas four other cases of Em for Emily were deleted from the list.
For each language, several hundred nicknames were collected, distributed as equally
as possible across men and women. Table 1 shows the exact number of nickname types
collected for each language.


4 Studies with individuals at differing ages showed that the bases and forms of nicknames
change with age during childhood and adolescence, cf. Kany (1992), Morgan/O’Neill/Harré
(1979), Naumann (1976, 1977).
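The one-token-per-type rule described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each database entry is reduced to a pair of nickname and legal first name; the function name `dedupe_tokens` and the sample entries are hypothetical and are not taken from the study's database.

```python
def dedupe_tokens(tokens):
    """Keep one token per (nickname, legal name) pair, preserving first-seen order.

    `tokens` is an iterable of (nickname, legal_name) tuples. Treating the legal
    first name as the key is an assumption made for this sketch.
    """
    seen = set()
    kept = []
    for nickname, legal_name in tokens:
        key = (nickname.casefold(), legal_name.casefold())
        if key not in seen:
            seen.add(key)
            kept.append((nickname, legal_name))
    return kept


# Hypothetical raw sample: one Em < Emma and five Em < Emily.
raw = [("Em", "Emma")] + [("Em", "Emily")] * 5
print(dedupe_tokens(raw))
```

Under these assumptions, Em survives twice, once per legal name, and the four surplus Em < Emily tokens are dropped, mirroring the example given in the text.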


Table 1: Number of nicknames per language and across sexes


                       German             Dutch              English
                       M     F     All    M     F     All    M     F     All

Number of nicknames    415   335   750    323   320   643    567   413   980


Since the data were collected using major internet search tools such as Google, 
they constitute a random collection of names. This is also reflected in the varia- 
tion of the number of nicknames per team, the variation of sports included (with 
soccer teams being the main source in all three samples), and the regional and 
national distribution (German and Austrian for G.; Dutch and Belgian for D.; US 
and UK for E.).⁶

The analysis is predicated on the assumption that the athletes provided their 
nicknames themselves or at least gave consent to publishing them on their team’s 
website. The collection therefore consists of nicknames that the bearers were 


5 The specific algorithms in such search tools provide the basis of the URLs returned. Therefore,
the collection may not be fully random. However, we used several different search tools and 
a broad variation of search terms, and the results reflect no identifiable patterns related to the 
use of specific search engines. We therefore assume that biases caused by the algorithms are 
negligible. 

6 Note that the data is not suitable for comparing national distributions to the same extent. The 
G. sample stems mainly from Germany, with only 39 of 750 entries from Austria. The D. sample is 
mainly from the Netherlands, with only 31 of 643 entries from Belgium. The E. data, by contrast, 
is distributed more or less equally across British (462 entries) and US websites (518 entries). 
aware of and accepted as positive nicknames. Derogatory names or nicknames
evaluated negatively by their bearers for any other reason are unlikely to appear 
in the material and would demand a different approach. 

Despite the limited number of nicknames in the database, the data can be 
considered representative of (positive) nicknames in current amateur sports teams. 
They provide the foundation for studying structural characteristics of nicknames 
and comparing them across the three languages. Unlike many other studies on 
nicknames (most of which were based on survey or experimental data), we will be 
able to discuss the social aspects behind nicknames to a limited extent only since 
background and context information about the nicknames’ origin and use was 
not on hand. However, apart from geographical information, we have reliable 
information about the nickname bearer’s sex, which has been identified as par- 
ticularly relevant in earlier studies. 

Section 4 will introduce the spectrum of nicknames in our data and identify 
those parts of the dataset that are suitable for identifying structural characteristics. 
The structural analyses themselves are presented in the subsequent sections 5-7. 


4 The spectrum of nicknames 


Whereas some nicknames are based on a person’s legal name, others do not 
formally resemble their legal name at all; both types are found in the data. Within 
these two categories more specific subtypes may be differentiated as shown in 
Figure 1. 


nickname

           based on legal name                                          not based on legal name
           first name      last name         first and last name        lexicon  onomasticon  creation
Example    Becs < Rebecca  Hammy < Hamilton  Matty J < Matthew Jensen   Tree     Spock        Trems
G.         58.9%           23.2%             2.0%                       7.2%     6.0%         2.7%
D.         53.0%           13.7%             3.1%                       18.4%    11.7%        0.2%
E.         41.9%           25.8%             8.4%                       15.5%    7.4%         0.9%


Fig. 1: The spectrum of nicknames
The relative frequencies per language show that the number of nicknames based on
legal names is higher than that of freely coined nicknames in all three languages, 
with most nicknames being based on first names. The legal name as a base is 
particularly strong in G. (84.1%) while D. and E. leave more room for other types 
(69.8% and 76.1% based on legal names, respectively). The simultaneous use of 
both parts of the legal name as a basis for the nickname is notably more frequent in 
E. (8.4%) than in G. and D., with a high number of nicknames formed as acronyms 
of the legal name (e.g., AB < Alex Brown). 
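The aggregate shares quoted in this paragraph follow directly from the per-category percentages in Figure 1. The short Python sketch below reproduces that arithmetic; the dictionary layout and the function name `legal_name_share` are illustrative choices, while the figures themselves are copied from Figure 1.

```python
# Relative frequencies (in %) per nickname category, as given in Figure 1.
SPECTRUM = {
    "G.": {"first name": 58.9, "last name": 23.2, "first and last name": 2.0,
           "lexicon": 7.2, "onomasticon": 6.0, "creation": 2.7},
    "D.": {"first name": 53.0, "last name": 13.7, "first and last name": 3.1,
           "lexicon": 18.4, "onomasticon": 11.7, "creation": 0.2},
    "E.": {"first name": 41.9, "last name": 25.8, "first and last name": 8.4,
           "lexicon": 15.5, "onomasticon": 7.4, "creation": 0.9},
}

# Categories that count as "based on legal name".
LEGAL_NAME_CATEGORIES = ("first name", "last name", "first and last name")


def legal_name_share(shares):
    """Sum the legal-name categories and round to one decimal place."""
    return round(sum(shares[cat] for cat in LEGAL_NAME_CATEGORIES), 1)


for language, shares in SPECTRUM.items():
    print(language, legal_name_share(shares))
```

The sums come out at 84.1 for G., 69.8 for D., and 76.1 for E., which are the values cited in the text.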

First names form the predominant base for nicknames in all three languages. 
By contrast, the use of last names as a base for nicknames varies among the three 
languages: D. uses such nicknames to the lowest extent (13.7%), while G. (23.2%) 
and E. (25.8%) exhibit higher frequencies. Interestingly, last names are used far
more frequently as a base for male than for female nicknames in all three languages
(G.: M 34.0%, F 9.9%; D.: M 19.8%, F 7.5%; E.: M 32.3%, F 16.9%).⁷ This
is, of course, not an inherent characteristic of nicknames but an effect of culture:
last names are more strongly associated with males than with females in all
three language communities because of a long patriarchal history of family
names being inherited along the male line.⁸ An additional sex-based difference is
observed regarding the use of nicknames that are not based on legal names:
these are consistently associated more frequently with men than with women (G.:
M 21.0%, F 9.6%; D.: M 39.0%, F 21.3%; E.: M 31.6%, F 13.3%).⁹


7 M = male, F = female.

8 There is a striking difference between the British and the American data concerning the use 
of first and last names. While in the British data nearly as many nicknames are based on first 
names (34.0%) as on last names (33.5%), first names as a base of nicknames are clearly dominant 
in the US (49.0% vs. 18.9%). The difference in nicknames based on last names with respect to sex, 
however, is more pronounced in the American data (M 27.2% F 11.2%) than in the British data 
(M 36.3% F 27.6%); this is in conformity with American studies by Busse (1983), amongst others. 
Note that a parallel sample of Swedish nicknames showed that last names are used more or less 
equally as a base of nicknames for both sexes in Swedish (cf. Kürschner 2014). This is in contrast 
to the languages considered here. 

9 De Klerk/Bosch (1996) found that female nicknames are much more often coined by family 
members or retained from childhood and then adopted by fellow peers. This is in stark contrast 
to male nicknames, which are much more often coined within specific peer groups. Since nick- 
names within families are often coined based on the individual’s first name, this could explain 
why such nicknames are more often found among women than among men. Among men, in 
contrast, there is a higher chance for newly coined nicknames to be based on personal, physical, 
or contextual characteristics or on the person’s last name. 
In subsequent sections, parallels and divergences in the formation of nicknames
are analysed according to phonological and morphological characteristics.
We describe what new, freely coined nicknames look like, assuming that nearly 
anything is possible in the formation of nicknames. In order to prevent the results 
from reflecting inherent characteristics of lexemes, we restrict ourselves to cases 
in which the product of nickname formation is truly free from the limitations of 
the lexicon. For this reason, we base our analyses on nicknames that stem from the 
processes of word formation or word creation only, ignoring all nicknames that 
are transferred from the existing lexicon or onomasticon. We therefore exclude 
nicknames that are homonymous with lexical items (like Son or Bird), unless they 
are the result of formation processes such as clipping (e.g., Mass < Massey).¹⁰
Also excluded are nicknames that are overtly identical to existing lexical items 
(e.g., Lizard < Liz, Strudel < Strudwick) or include the legal name in a syntagmatic 
construction (Geiger the Tiger < Geiger). Additionally, we exclude nicknames 
homonymous with existing names (Spence < Spencer), including those of well- 
known people (Tom Hanks), figures (Ali Baba), products (Q-tip), and the like. 
Table 2 shows the resulting number of nicknames used for the analyses provided 
in the subsequent chapters. 
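The exclusion criteria listed above can be read as a simple filter over the dataset. The following Python sketch is one possible rendering under stated assumptions: the `Nickname` record, the `process` labels, and the tiny `LEXICON` and `ONOMASTICON` sets are all hypothetical stand-ins, since the chapter does not describe how the checks were carried out in practice.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for a real lexicon and a real name inventory.
LEXICON = {"son", "bird", "mass", "lizard", "strudel"}
ONOMASTICON = {"spence", "tom hanks", "ali baba", "q-tip"}


@dataclass(frozen=True)
class Nickname:
    form: str             # the nickname as listed in the profile
    base: Optional[str]   # legal name it is derived from, None if freely coined
    process: str          # assumed label, e.g. "clipping", "extension", "transfer", "creation"


def keep_for_structural_analysis(nick: Nickname) -> bool:
    """Return True if the nickname stays in the reduced dataset."""
    form = nick.form.casefold()
    words = form.split()
    # Homonymous with an existing name (Spence, Tom Hanks, Q-tip): excluded.
    if form in ONOMASTICON:
        return False
    # Legal name embedded in a syntagmatic construction (Geiger the Tiger): excluded.
    if nick.base and len(words) > 1 and nick.base.casefold() in words:
        return False
    # Homonymous with a lexical item: kept only if it results from clipping (Mass < Massey).
    if form in LEXICON:
        return nick.process == "clipping"
    return True


sample = [
    Nickname("Mass", "Massey", "clipping"),
    Nickname("Lizard", "Liz", "extension"),
    Nickname("Geiger the Tiger", "Geiger", "extension"),
    Nickname("Spence", "Spencer", "clipping"),
    Nickname("Son", None, "transfer"),
    Nickname("Becs", "Rebecca", "clipping"),
]
print([n.form for n in sample if keep_for_structural_analysis(n)])
```

The order of the checks matters in this sketch: the name inventory is consulted first, so a clipping such as Spence is excluded even though clippings that merely coincide with lexical items, such as Mass, are retained.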


Table 2: Reduced data used for phonological and morphological analyses


                       German             Dutch              English
                       M     F     All    M     F     All    M     F     All

Number of nicknames    249   207   456    126   149   275    249   224   473


In sections 5-7, we provide an exploratory analysis of the data. We elaborate 
observations derived from a thorough review of the data, including frequency 
measures of observed patterns. Section 5 first examines syllabic characteristics to 
illuminate which syllable numbers, syllable types and segmental features shape 
nicknames in the three languages. Next, in section 6 we investigate the morpho- 
logical mechanisms behind nickname formation. The creation of free forms is 
described in section 7. Finally, section 8 presents the results of the contrastive 
analysis. 


10 In such cases, the relation to the lexicon is considered secondary, without knowing whether 
it was intended in the first place. 
5 Syllabic characteristics of nicknames


Nickname formation is a very creative process in language which enhances our 
understanding of the shape of possible words: 


Nicknames, because they act as an avenue for creativity and the expression of some of the 
pure enjoyment that the sounds and meanings of words can give, provide name-users and 
name-bearers with considerable freedom in manipulating and bending linguistic resources. 
They provide evidence of the ongoing enjoyment that human beings find in playing with 
language and creating new words which experiment with patterns of sounds. (de Klerk/ 
Bosch 1997: 293) 


In other words: nicknames show what words can look like without (or nearly with- 
out) the restrictions imposed by lexical patterns. Changes in other lexical items, by 
contrast, reflect general constraints on processes of language change; loan words 
fail to be revelatory in this respect, and new word-formation products are restricted 
by word formation processes. Nicknames therefore provide insights into the potential
structure of entirely new words (cf. Kürschner 2018).¹¹ In order to determine the
spectrum of nicknames in the three languages, their identifying characteristics,
and whether they differ between the sexes,¹² their syllabic characteristics will now
be explored. An in-depth study of more specific sound patterns would be valuable 
for each of the languages examined, but cannot be provided here. 

The mean length of the nicknames in our sample is two syllables or less in all 
three languages (G.: 2.0; D.: 1.7; E.: 1.8 syllables). The legal names from which
nicknames are derived are on average longer than the associated nicknames (G.: 1.2;
D. and E.: 1.3 times as long as the corresponding nicknames), with female names
on average being more readily shortened than male ones. The reason for this
distribution is that male first names are generally shorter than female first names
(cf. Whissell 2001: 108 on E.; Nübling 2012 on G.). In D. and E., female nicknames
also tend to be slightly shorter than male ones (D.: F 1.6; M 1.9; E.: F 1.7; M 1.8
syllables). In fact, there are more monosyllabic female nicknames in D. and E. than
disyllabic ones, whereas disyllabic structures are more clearly favoured in male 
nicknames (cf. Table 3, which compares the number of syllables in nicknames). 
This tendency in D. and E. not only contradicts many earlier studies which found 


11 Other valuable data of this kind are provided by short words (Ronneberger-Sibold 1995) and 
product names (Ronneberger-Sibold/Wahl 2013). 

12 Cf. Cutler/McQueen/Robinson (1990) on E., Oelkers (2003) and Nübling (2012) on G. Their
work has shown that sound patterns assist in the association between names and their possible 
bearers’ sex, which might be relevant for nicknames, too. 
that shorter names (specifically monosyllabics, cf. Elsen 2016: 121) are typically
masculine and longer names feminine (cf. de Klerk/Bosch 1996: 536-539), but 
also contrasts starkly with G.: while a tendency towards disyllabic nicknames can 
be observed across all three languages, D. and E. use monosyllabic nicknames 
much more readily than G., where disyllabic nicknames are used extensively. In 
the following sections, the structures of the two frequent groups, viz. mono- and 
disyllabics, are presented in detail. 
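The mean lengths reported above can be approximated from the percentage distribution in Table 3. The Python sketch below does so for the "All" columns. It rests on one simplifying assumption that is not in the source: the "5 or more" row is counted as exactly five syllables. Because the table percentages are themselves rounded, the results only approximate the published means.

```python
# Share of nicknames (in %) per syllable count, "All" columns of Table 3.
# Assumption: the "5 or more" row is treated as exactly 5 syllables.
TABLE3_ALL = {
    "G.": {1: 11.0, 2: 79.6, 3: 6.8, 4: 2.6, 5: 0.0},
    "D.": {1: 36.0, 2: 54.9, 3: 8.0, 4: 1.1, 5: 0.0},
    "E.": {1: 38.8, 2: 51.5, 3: 6.8, 4: 2.1, 5: 0.8},
}

# Means as reported in the running text.
REPORTED_MEANS = {"G.": 2.0, "D.": 1.7, "E.": 1.8}


def mean_syllables(distribution):
    """Weighted mean syllable count from a {syllables: percentage} mapping."""
    total = sum(distribution.values())
    return sum(count * share for count, share in distribution.items()) / total


for language, distribution in TABLE3_ALL.items():
    estimate = mean_syllables(distribution)
    print(f"{language} estimated {estimate:.2f}, reported {REPORTED_MEANS[language]}")
```

Under these assumptions each estimate falls within about 0.1 syllables of the reported mean, and the ordering G. > E. ≈ D. is preserved.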


Table 3: Syllable length in nicknames


Syllable    German                      Dutch                       English
length      M         F        All      M        F        All      M        F        All
            (249)¹³   (207)    (456)    (126)    (149)    (275)    (249)    (224)    (473)

1           10.4%     11.6%    11.0%    20.6%    49.0%    36.0%    30.9%    47.5%    38.8%
2           80.7%     78.3%    79.6%    65.1%    46.3%    54.9%    59.4%    42.6%    51.5%
3           6.0%      7.7%     6.8%     13.5%    3.4%     8.0%     6.8%     6.7%     6.8%
4           2.8%      2.4%     2.6%     0.8%     1.3%     1.1%     2.4%     1.8%     2.1%
5 or more   –         –        –        –        –        –        0.4%     1.3%     0.8%


5.1 Structural aspects of monosyllabics 


Monosyllabic nicknames are mostly products of shortening. Since the sound patterns
specific to nicknames are of particular interest here, the parts of the names
that do not simply reflect the characteristics derived from the legal names form
the core of the current discussion. In shortening processes, the number of clipped
sounds is unpredictable. Since shortening mostly affects the end of the respective
base (end clippings, cf. section 6.2 below), we focus on final sounds.¹⁴ The three
languages show a parallel tendency towards closed, i.e. consonant-final, syllables
in monosyllabic nicknames, cf. Table 4. This tendency is more pronounced in
D. and E. than in G., where open syllables are more often used. However, we find
a sex-based distribution in G., as closed syllables are especially frequent in male
monosyllabics (Det < Detlef), while female nicknames of this kind much more
often appear with a final vowel (Co < Corinna, Bo < Borton vs. Ker < Kerstin). This
corroborates studies on first names showing that male names tend to end in
closed syllables, while female names tend to end in open syllables (cf. Nübling
2012: 333-334; Oelkers 2003: 185-189) and that monosyllabics generally bear a
male connotation (cf. Nübling 2012: 345-346; Oelkers 2003: 145-151, for E. also
Cutler/McQueen/Robinson 1990: 475-478).


Table 4: Syllable types in monosyllabics


Syllable    German                     Dutch                      English
type        M        F        All      M        F        All      M        F        All
            (26)     (24)     (50)     (26)     (73)     (99)     (77)     (106)    (183)

Closed      80.8%    58.3%    70.0%    84.6%    84.9%    84.8%    88.3%    79.2%    83.1%
Open        19.2%    41.7%    30.0%    15.4%    15.1%    15.2%    11.7%    20.8%    16.9%


Final sounds in closed monosyllabic nicknames are predominantly fricatives.
While this predominance is readily observable in E. (67.8%), a greater diversity
in the use of final sounds is evident in G. (42.9%) and D. (38.1%). In E. (52.6%)¹⁵
just as in D. (25.0%), -s is used most often as the final sound (D. Rens < Renske,
Jikks < Jikke; E. Becs < Rebecca, Klepps < Kleppe). G. uses a greater variety of
sounds, with /ʃ/ and /f/ being the predominant choice (17.1% each, cf. Jansch <
Janina, Scheuf < Scheufler). Fricatives are slightly more common as final sounds in
G. and E. male nicknames than in female ones (G.: M 47.6%, F 35.7%; E.: M 72.1%,
F 64.3%). The opposite tendency can be found in D. (M 18.2% vs. F 45.2% fricatives),
where 45.5% of the closed monosyllabic male nicknames end in a sonorant and 36.4%
in a stop; this difference coincides with a stronger presence of final -s in female
nicknames in D. (M 9.1%, F 30.6%), while -s is slightly more frequent with male
names in E. (M 58.8%, F 47.6%).


13 Percentages are used in this and the following tables to assure comparability. The number of
items analyzed per category is provided in parentheses with the column names.

14 Final sounds often reflect a sound provided by the base (unless a suffix is added) and thus a
characteristic thereof. However, when a nickname is coined, a choice is made with regard to a
new final sound, which is reflected in the nickname. For instance, for Kerstin back clippings can
result in, among others, Kersti, Kerst, Kers, and Ker. These each provide a different final sound,
amongst which the nickname creator can freely choose. By contrast, when considering the
beginning of nicknames, only the structural characteristics of bases’ initial sounds would be
reflected.

15 -s is realized as [s] or [z] in E., depending on the voicing of the preceding sound. In G. and D.,
it is always pronounced as [s] in the final position.

Studies of sound symbolism have found obstruents to be associated with masculinity,
sonorants with femininity (cf. Whissell 2001: 106). As for nicknames, however,
our results contradict these associations with respect to D. and E. Not only are
many female nicknames monosyllabic; D. in particular also often uses obstruents to
create female nicknames, possibly as a playful way to subvert common sex-role
based associations. This is supported by Wierzbicka (1992: 375-383), who describes 
the onymic suffix -s in Australian English as having an “anti-diminutive” function 
that is often used by adolescent girls: “the speaker wishes to dissociate himself 
or herself emphatically from the kind of emotional attitude associated with 
diminutives” (Wierzbicka 1992: 378). In Wierzbicka’s study, the diminutive function 
is associated with names ending in /i/ which are often retained from childhood. 
Considering that many of the names in our data stem from adolescents, nick- 
naming can be interpreted as a playful manner of dealing with adolescence and 
sex roles. 

The three languages also differ in the number of final consonants in closed 
syllables. In G. and E., monosyllabic nicknames end in a single consonant about 
as often as they end in a two-consonant cluster (G.: 51.4% vs. 48.6%, E.: 52.0% 
vs. 48.0% for single vs. two consonants, respectively), while D. shows a clear 
tendency towards single final consonants in monosyllabic nicknames (82.1% vs. 
17.9%). Across the three languages, clusters appear particularly often with final -s 
and, in G., /ʃ/. In most cases, these sounds provide clusters that run counter to
the sonority hierarchy. For example, in G. Sabs < Sabrina, D. Rox < Roxanne, and 
E. Lyds < Lydia, the order of sounds in the syllable coda contradicts expectations 
based on sonority: although /s/ is more sonorous than the stop, it is placed behind 
the stop in the syllable coda and thus forms an extra-syllabic element. This may 
evoke an expressive effect, enhance salience, and subvert sex-role stereotypes as 
suggested above. 
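
The sonority argument can be made concrete with a small check. The following Python sketch is illustrative only and is not part of the study's method: the four-step sonority scale, the reduced segment inventory, and the helper name `coda_violates_sonority` are all simplifying assumptions. It flags a coda whenever a later consonant is more sonorous than the consonant before it.

```python
# Assumed, simplified sonority scale (higher = more sonorous).
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4}

# Assumed, reduced segment inventory for illustration only.
SEGMENT_CLASS = {
    "p": "stop", "b": "stop", "t": "stop", "d": "stop", "k": "stop", "g": "stop",
    "f": "fricative", "s": "fricative", "z": "fricative", "ʃ": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "liquid", "r": "liquid",
}


def coda_violates_sonority(coda):
    """Return True if sonority rises anywhere within the coda.

    `coda` is a list of consonant symbols in linear order, e.g. ["b", "s"].
    A well-formed coda is expected to fall (or stay level) in sonority
    towards the syllable edge, so any rise counts as a violation.
    """
    ranks = [SONORITY[SEGMENT_CLASS[segment]] for segment in coda]
    return any(later > earlier for earlier, later in zip(ranks, ranks[1:]))


# Codas of nicknames cited in the text (transcriptions are assumptions).
examples = {
    "Sabs": ["b", "s"],
    "Rox": ["k", "s"],
    "Lyds": ["d", "z"],
    "Rens": ["n", "s"],
}
for nickname, coda in examples.items():
    print(nickname, coda_violates_sonority(coda))
```

Under these assumptions, the stop-plus-sibilant codas of Sabs, Rox, and Lyds are flagged, whereas the nasal-plus-sibilant coda of Rens is not.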

Considering vowel-final monosyllabic nicknames, E. shows a strong prefer- 
ence for final /i:/ (Si < Sierra), and /i:/ and /o:/ for G. (Fi < Fiona, Flo < Florian), 
whereas D. does not show any specific tendency at all. 


5.2 Structural aspects of disyllabics 
As Table 5 shows, disyllabic nicknames prototypically end in an open syllable in
G. Open syllables are also characteristic of, though not as frequent in, the other
two languages.
Table 5: Syllable types and frequent final sounds in disyllabics


syllable type       German                  Dutch                   English
- final sound       M       F       All     M       F       All     M       F       All
                    (201)   (162)   (363)   (82)    (69)    (151)   (148)   (95)    (243)
closed              12.9%   14.8%   13.8%   25.6%   23.2%   24.5%   18.2%   34.7%   24.7%
- fricatives        5.0%    0.0%    2.8%    8.5%    7.2%    7.9%    6.8%    14.7%   9.9%
- sonorants         7.0%    13.0%   9.6%    14.6%   11.6%   13.2%   8.1%    14.7%   10.7%
- stops             1.0%    1.9%    1.4%    2.4%    4.3%    3.3%    3.4%    5.3%    4.1%
open                87.1%   85.2%   86.2%   74.4%   76.8%   75.5%   81.8%   65.3%   75.3%
- -i                50.2%   61.7%   55.4%   43.9%   49.3%   46.4%   58.1%   38.9%   50.6%
- -[ə]              21.4%   8.6%    15.7%   17.1%   15.9%   16.6%   -       -       -
- -o                8.0%    4.3%    6.3%    4.9%    5.8%    5.3%    13.5%   12.6%   13.2%
- -a                5.0%    9.3%    6.9%    6.1%    4.3%    5.3%    3.4%    4.2%    3.7%


With respect to the specific final vowel, /i/ is dominant in all three languages 
(G. Fabi < Fabienne, D. Winski < Elwin, E. Welly < Llwelyn). While -i is more com- 
monly present in female nicknames in G., it is particularly predominant among 
male names in E.,¹⁶ with D. showing no such tendency. G. and D. also frequently
have the schwa vowel ([ə]) as final sound, whereas this is not used in English
(G. Tobse < Tobias, D. Ceke < Cedric). In G., male nicknames end in final schwa 
much more frequently than female nicknames, whereas the schwa is used equally 
for both sexes in D. In all languages, -o and -a are next in the list of frequent final 
vowels (G. Mazlo < Mazalovic, D. Roffa < Rovers, E. Haylbo < Haley), both with a 
much lower frequency than -i. However, the frequency of -o is comparatively high 
in E. Section 6 shows that the presence of these final vowels results either from 
clipping (Emi < Emily) or from suffixation (Emi < Emma). 

In E., female nicknames are much more often found with closed syllables than 
male ones, again contrasting with findings on the structure of first names from 
earlier nickname studies (see discussion above in section 5.1). Across the three lan- 
guages, closed syllables in nicknames mostly end in sonorants and only very sel- 
dom in stops. Sonorants predominate abundantly in G. (Heinzen < Hein), and the 
fricative -s is also strongly represented in D. and E. (D. Laris < Larissa, E. Meggers < 
Megan), as indeed with monosyllabics. Any fricatives apart from -s are very rare. 


16 This contrasts strongly with the findings of de Klerk/Bosch (1996) for South African English, 
where final -i is regarded as prototypically feminine. 
Summing up the results from the analysis of mono- and disyllabics with
regard to final sounds, G. generally seems to prefer sonorant endings for nick- 
names, while D. and E. tend to end in -s (sonorants apart). In G., a final fricative 
evokes a masculine connotation in nicknames, especially monosyllabics. 

Apart from the final sound, it is interesting to observe how the two syllables 
are linked in disyllabics. The tendency is for the link to consist of a single consonant 
that is either the onset of the second syllable or ambisyllabic. Consonant clusters 
are found in a number of nicknames, most often in D. (36.3%), less so in E. (30.0%), 
and least often in G. (21.2%). Most clusters are remnants from the legal name on 
which the nickname is based. While such clusters are often reduced in G. (54.2% 
of all cases in which the legal name contains a cluster, cf. Wale < Walter) and D. 
(48.9%, cf. Possie < Postulart), this is somewhat less often the case in E. (36.4%, 
Shelly < Shelbi). Cluster reduction is particularly frequent in female nicknames in 
G. and D., where it can be interpreted as a simplification of syllable structure result- 
ing in a CVCV-nickname. Although consonant clusters are a typical phenomenon 
of G. lexical items generally, nicknames clearly deviate from this characterization. 
As a result they appear softer and, if sonorants are involved, more sonorous.¹⁷

On the other hand, when no cluster is present in the legal name, new clus- 
ters are occasionally created, potentially contributing to expressivity. Clusters 
appearing more than once are combinations of a voiceless stop and a sibilant in 
G. (Britschi < Britta, Natze < Nadine, Robser < Robin), combinations of a nasal and 
a voiceless stop in D. (Dompie < Dom, Jonko < Jon), combinations of a consonant 
and a sibilant in E. (Bailzo < Baillie, Natson < Natalie); G. and E. thus share the use 
of sibilant-final clusters. Apart from these clusters, G. and D. form clusters that 
are also typical of diminutives; these reflexes of morphology are discussed below 
in sections 6.3-6.4. 


6 Processes of nickname formation 


In this section, we examine morphological and extra-grammatical processes that 
are applied in the creation of nicknames. In analysing these processes, we only 
consider nicknames that are based on legal names. Table 6 provides an overview 
of the fundamental processes found in the data.


17 As Nübling (2012: 342-343) notes, consonant clusters in popular German first names have 
been diminishing since 1945, resulting in softer names. If sonorants are involved, their sonorous 
quality is even more obvious in the absence of other consonants (ibid.: 336-338). 
Table 6: Nickname formation processes with examples (per process type, first example based
on first name, second example based on last name)


process type            German                               Dutch                      English
                        base                nickname         base           nickname    base        nickname
acronym formation       Marion Peter        MP               Patricia       Pe          Helen       H
                        Florian Kienberger  Floka            Hannah Bikker  HB          Alex Brown  AB
clipping
- back                  Julia               Jul              Marianne       Mari        Riley       Ri
                        Kleefeld            Klee             Toutenhoofd    Tout        Primmer     Prim
- fore                  Janine              Nine             Andre          Dre         Rebekah     Bekah
- edge                  -                   -                Simone         Moon        Natasha     Tash
- middle                Georg               Gorg             -              -           Garret      Gart
suffixation             Maik                Maiker           Jorrit         Jorrito     Tom         Tommo
                        Schulz              Schulzi          Mik            Mikkie      Berg        Bergie
clipping + suffixation  Tobias              Tobse            Arthur         Arti        Jordyn      Jordo
                        Ulbrich             Ulle             Duijkers       Duiky       Hocknell    Hockers
reduplication           Teresa              Tete             Jolanda        Jojo        Cole        Coco
compounding             Karl                Partykarl        Loes           Loesbal     Kayla       Kaylabug
                        Kock                Keilriemen Kock  Maas           Maaskantje  Bower       Bower Power
blending                Alexander           Skandalex        Romboud        Rombocop    Mika        Mikattack
                        Miriam Schmitz      Schmiri          -              -           Fuller      Fulldog
defamiliarization       Marco               Darco            Nico           Nocci       Shannon     Shewan
and word play           Sarah               Särah            Flendrie       Flen3       Donagan     Dank-sho
other                   Ulf                 Mulf             Van de Kreeke  Kreeke      Ben         Been
free forms              -                   Pötzi            -              Sjiemelle   -           Udzy
                        -                   Tuff             -              -           -           Guence
In order to compare nickname formation in the three languages more closely,
several processes will be examined in detail along with the frequency with which
they are applied.¹⁸ Table 7 outlines the relative number of items associated with
each of the processes introduced above. 


Table 7: Processes involved in nickname formation


process type                      German                  Dutch                   English
                                  M       F       All     M       F       All     M       F       All
                                  (249)   (207)   (456)   (126)   (149)   (275)   (251)   (224)   (475)
acronym formation                 1.6%    1.4%    1.5%    4.0%    4.0%    4.0%    20.5%   11.7%   16.3%
clipping                          18.1%   21.3%   19.5%   19.0%   49.7%   35.6%   16.5%   29.9%   22.9%
- back                            16.9%   17.4%   17.1%   15.9%   46.3%   32.4%   14.5%   25.6%   19.7%
- fore                            -       3.9%    1.8%    3.2%    -       1.5%    1.2%    2.2%    1.7%
- edge                            -       -       -       -       3.4%    1.8%    0.4%    1.8%    1.1%
- middle                          1.2%    -       0.7%    -       -       -       0.4%    0.4%    0.4%
suffixation                       11.9%   10.1%   11.2%   23.0%   4.7%    13.1%   21.3%   8.5%    15.3%
clipping + suffixation            52.2%   48.8%   50.9%   35.7%   29.1%   32.4%   28.1%   37.7%   32.6%
reduplication                     0.8%    1.0%    0.9%    -       1.3%    0.7%    1.2%    3.6%    2.3%
compounding                       1.6%    -       0.9%    0.8%    2.0%    1.5%    1.2%    3.1%    2.1%
blending                          1.2%    1.9%    1.5%    0.8%    -       0.4%    2.8%    2.7%    2.8%
defamiliarization and word play   6.8%    11.6%   9.0%    11.9%   9.4%    10.5%   5.2%    3.6%    4.4%
other                             0.4%    1.9%    1.1%    4.0%    0.7%    2.2%    1.2%    2.2%    1.7%
free forms                        6.0%    2.4%    4.4%    0.8%    -       0.4%    3.2%    0.4%    1.9%


18 Free forms are described separately (cf. section 7) because they do not result from the 
manipulation of a base and therefore cannot be described as a product of morphological or 
extra-grammatical processes. They are listed in Tables 6 and 7 for purposes of comparison with 
the other processes in terms of form and frequency. 




6.1 Acronym formation 


Acronyms are most commonly formed by reducing names to initials based on the 
first letters of the legal name or parts thereof (G. MP < Marion Peter). Whereas G. 
and D. use such initialisms less frequently (1.5% and 4.0%, respectively), they are 
relatively frequent in E. (16.3%). Furthermore, acronyms from E. names are used 
much more often in forming male nicknames than female ones. In the three lan- 
guages, they are found in multitudinous forms: i) A single initial based on the 
first or last name (J < Jack, G < Gibbons; uncommon in G.); ii) two initials based 
on double first names or the first and last names (HW < Hans-Werner; AG < Alex 
Gilbert); iii) three or more initials (ABC < Alex Benjamin Carr; in Dutch, the last 
name van den Tol is clipped to VDT); and iv) initials in combination with a full 
name or shortening (JYau < Jason Yau; D-Mo < David Mott). Letters are sometimes 
written as pronounced (Floka (see Table 3), with ka /ka:/ as the pronounced form of 
the letter <k>; cf. also Dutch Pe < Patricia). 

Certain cases of nicknames in which clippings are combined by using sounds 
or syllables instead of letters are quite similar to many acronyms, especially when 
the first and last names are merged (G. MiGrü < Michael Grünwedel; D. Snoord <
Sander Noordink). We will, however, treat them as clippings; see section 6.2 immediately
below.


6.2 Clipping 


Clipping is a major process of shortening words in all three languages and equally 
used to form nicknames. In this section, we discuss “pure” clipping (for com- 
bined clipping and suffixation, see section 6.4 below), starting with the obser- 
vation that D. employs pure clipping the most (35.6%), followed by E. (22.9%) 
and G. (19.5%). 

The most frequent type of clipping is back-clipping, which is expected in 
Germanic languages due to their word-initial accent. Back-clipping varies in 
the number of sounds deleted and usually involves a reduction in the number of 
syllables. We first calculated the back-clipping frequency with respect to the 
number of syllables: in D. (81.6%) and E. (89.8%), most of the resulting nicknames 
are monosyllabic (D. Tout < Toutenhoofd, E. Ri < Riley), whereas in G., most are 
disyllabic (66.3%). The last sound of many disyllabic G. nicknames is a vowel 
homonymous with one of the many suffixes that are popular in nickname forma- 
tion, namely, schwa (Ale < Alexander), -a (Katha < Katharina), -o (Karo < Karolin), 
and, most frequently, -i (Fabi < Fabian), cf. section 6.3 below. 
Other types of clippings are rather marginal in all three languages. Interestingly,
fore-clipping in G. is only found in female nicknames, while the situation in D. is the 
exact opposite. On the other hand, fore-clippings are restricted to first names in both 
languages and reflect first names whose first syllable is unstressed (G. Nine < Janine, 
Resa < Maresa, Bekka < Rebekka, Nessa < Vanessa; D. Dre < Andre, Thieu < Matthieu, 
Mon < Ramon). Female nicknames with this structure in D. are occasionally clipped 
at both ends (edge clipping, Les < Celeste, Ris < Mariska, Moon < Simone). 

Pure clipping is used less often for male nicknames than for female nick- 
names in D. This is probably a reflection of female first names being longer on 
average (2.4 syllables) than male first names (1.9 syllables). The fact that the 
opposite holds in first name suffixation (cf. section 6.3 below) supports this view: 
female names generally tend to be shortened, while male names occasionally 
undergo lengthening. The length of first names shows the same asymmetry in 
G. and E., yet there is no sex-based asymmetry in clipping vs. suffixation; instead, 
these languages make more use of a flexible combination of clipping and suffixa- 
tion (cf. section 6.4 below). 

In summary, clipping is productive in all three languages, especially D., with 
back-clipping being most productive across the three languages. Clipping mostly 
results in monosyllabics in D. and E., while G. shows many disyllabic clippings. 
D. exhibits a strong distinction in sex-based use, with clipping being most pro- 
ductive in female rather than male nicknames. 
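
The four clipping types distinguished in this section can be told apart mechanically from the spelling of base and nickname. The Python sketch below is a rough illustration under stated assumptions, not the coding procedure used for the database: the function name `classify_clipping` is hypothetical, the comparison is purely orthographic and case-insensitive, and middle clipping is approximated as "same first and last letter, with the remaining letters kept in order".

```python
def is_subsequence(short, long):
    """Return True if the letters of `short` occur in `long` in the same order."""
    remaining = iter(long)
    return all(letter in remaining for letter in short)


def classify_clipping(base, nickname):
    """Classify a pure clipping by which part of the base is retained.

    Returns "back", "fore", "edge", "middle", or None if the nickname is
    not an orthographic clipping of the base under these assumptions.
    """
    b, n = base.lower(), nickname.lower()
    if not n or len(n) >= len(b):
        return None
    if b.startswith(n):
        return "back"    # end of the base removed: Jul < Julia
    if b.endswith(n):
        return "fore"    # beginning of the base removed: Nine < Janine
    if n in b:
        return "edge"    # both edges removed: Tash < Natasha
    if n[0] == b[0] and n[-1] == b[-1] and is_subsequence(n, b):
        return "middle"  # material removed from the middle: Gorg < Georg
    return None


for base, nickname in [("Julia", "Jul"), ("Janine", "Nine"),
                       ("Natasha", "Tash"), ("Georg", "Gorg")]:
    print(nickname, "<", base, "->", classify_clipping(base, nickname))
```

Because the check works on spelling alone, a respelled form such as Moon < Simone is not recognized as an edge clipping; a phonemic transcription would be needed for such cases.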


6.3 Suffixation 


With percentages ranging from 11 to 15, suffixation is not a predominant process
of nickname formation in any of the three languages (unless combined with 
clipping, see section 6.4 below). While clipping is primarily employed with rather 
long first and last names in our data (on average G. 3, D. 2.5, E. 2.4 syllables), suffixa- 
tion mostly affects short legal names (average G. 1.7, D. 1.4, E. 1.1 syllables) and is 
thus mostly a means of lengthening. This may be why D. and E. apply suffixation 
mostly with male names, as male first names are generally shorter than female. 
Various types of suffixes are involved in suffixation. Some suffixes are homony- 
mous with derivational suffixes like -er, which is typically used to construct agent 
nouns (G. Maiker < Maik, D. Schrager < Schrage; E. Lopper < Lopp) or inflectional 
suffixes like -(e)s (G. Rammes < Ramm; E. Pontins < Pontin).¹⁹ We also find existing


19 In E. Youngen < Young, -en may have been inspired by the unproductive plural suffix -en 
(oxen), but could just as well have arisen by mere chance. 
onymic suffixes, predominantly loan suffixes, being used in the formation of last
names (G. Klausson < Klaus; E. Wayneski < Wayne, Jordanovic < Jordan) and the
Portuguese diminutive suffix -inho which is well-known from the names of various
Brazilian soccer players like Ronaldinho (G. Nansyinho < Nansy, D. Robinho <
Robin).²⁰

While E. has no productive diminutive suffix, G. and D. do. Compared with G. 
and its diminutive suffix -chen, D. takes much more advantage of the equivalent 
suffix -je in nickname formation (D. 27.8% and G. 15.7% of all names in this category,
respectively). The allomorphy rules of D. mostly apply: the allomorph -tje is
used if the legal name ends in an alveolar sonorant or a vowel (Carooltje < Carola, 
Guytje < Guy vs. Rikje < Rik). In G., umlauts are employed in established names 
or homonyms of lexical items (Simönchen < Simone, Wölfchen < Wolff, a last name 
homophonous with Wolf ‘wolf’), although not necessarily (Julchen < Julia, Karlchen 
< Karla). 

The use of diminutive suffixes fits well with the hypocoristic nature of posi- 
tive nicknames. The most frequent suffix in all three languages is -i (G. 51.0%, 
D. 41.7%, E. 72.2%), mostly realized graphically as <i>, <ie> or <y> (G. Schulzi < 
Schulz, Kimmi < Kim, D. Mikkie < Mik, Derkie < Derk, E. Bergie < Berg, Quinny < 
Quinn). This extremely popular suffix is clearly associated with prototypical nick- 
names in all three languages. It has been described as hypocoristic in all three 
languages and is also used productively in shortenings in G. (Compi < Computer 
‘computer’, Studi < Student ‘student’; cf. Köpcke 2002). While -i is slightly more 
popular in female nicknames than male ones, but nevertheless highly frequent 
with both sexes, the suffix -o is reserved for male nicknames (G. Daniello < Daniel, 
D. Jorrito < Jorrit, Kanto < Kant, E. Tommo < Tom, Willo < Will).²¹ The suffix -o is
also used in G. shortenings, where it carries a pejorative meaning (Anarcho ‘anar- 
chist’, cf. the full form Anarchist). This connotation may account for its preference 
for male nicknames, supporting the stereotype of roughness and toughness. 

Apart from derivational, inflectional, and onymic suffixes, a few infrequent 
suffixes are nickname-specific. Largely extensions of -i and -o (G. Timbo < Tim, 
D. Jonko < Jon, E. Taitso < Tait), these suffixes help form expressive consonant 
clusters (cf. section 5.2 above) and may be regarded as allomorphs of vowel-based 
suffixes. 


20 Cf. also the G. feminization suffix -ine (Olafine < Olaf) and the E. suffix -ers from last name 
formation (Lovers < Lovegrove). 

21 For Australian English, Taylor (1992) reports that -o and -i are used for distinguishing between 
first and last names as bases of nicknames, cf. Stevie < Stephen vs. Stevo < Stephenson. There is 
no such distribution with -o and -i in our data.
6.4 Clipping and suffixation combined


The two processes introduced above are often used in combination. While suffixa- 
tion can only lengthen names and clipping only shorten them, both processes in 
combination are able to produce nearly any kind of desired output. In G., their 
combination is by far the most frequent source of nicknames (G. 50.9%); in D. 
(32.4%) and E. (32.6%), the two processes are combined less frequently, but still 
quite often. 

The variety of suffixes is much greater when combined with clipping. In G.
and D., the i-suffix is as dominant as in pure suffixation (55.6% and
43.8%, respectively; only 26.6% in E.). By contrast, E. exhibits a strong preference 
towards suffixes with -s when combined with clipping. The -s suffix (30.5%) is 
most commonly used for creating monosyllabic nicknames (Tins < Tina, Bex < 
Becca) while syllabic forms ending in -s (most frequently with -ers, 7.1%) are 
found in disyllabic ones (Hockers < Hocknell, Meggers < Megan, Strudders < Strudwick).
The -s suffixes constitute the most frequent type for forming female nicknames
in E., with -i being the most frequent type for male nicknames.²² It should
be noted that -i occurs more frequently in male rather than female nicknames, the 
opposite of which holds true for G. and D.; cf. the frequencies of final sounds in 
sections 5.1 and 5.2. Both syllabic and non-syllabic types of -s suffixes are found in
G. and D., too, but are used rather marginally. 

All languages use the suffix -o in combination with clipping (although only 
marginally in D.), without being fully restricted to male names (G. Julo < Julia, 
E. Kelso < Kelsey). However, its frequency with male names in G. is considerably 
higher (M 7.7% F 1.0%). 

G. and D. also use two types of suffixes that do not exist in English. 1) Diminutive
suffixes are found in a greater variety of forms than in pure suffixation.
Apart from the Standard G. -chen, the dialectal -le (Djole < Djordje) and -(e)l are 
also used (Xandl < Alexandra, Resel < Theresa) in addition to the written dialectal 
pronunciation -sche [ʃə] for -chen (in Sofflsche < Sophia even in combination with
an -l suffix). In D., the standard form -je is accompanied by dialectal or Frisian 
forms as -(e)ke, cf. Miranneke < Miranda, Ceke < Cedrik. Diminutive forms provide 
2.6% of all G. and 13.5% of all D. nicknames of this type. 2) The schwa suffix is 
rather strong in both languages (G. 9.5%, D. 5.6%) and primarily used for the 
formation of male nicknames (G. Sebbe < Sebastian, Lense < Lensing; D. Joene < 
Jeroen). 


22 This is in stark contrast to earlier nickname studies where -i was clearly more frequent in 
female names, cf. Cutler/McQueen/Robinson (1990: 478), de Klerk/Bosch (1997: 298). 
Reduplication occurs more frequently in English (2.3%) than in the other two
languages (E. Lele < Leia, Du-Du < Du Frane), but is still rather marginal. Occasional
rhyme pairs have also been found (G. Reusel Meusel²³ < Reusing, D. Ellebel <
Ellen).

We finally consider the frequency of the suffixes when no distinction between 
regular suffixation and suffixation combined with clipping is made. Table 8 pro- 
vides an overview of the most frequent suffixes with their percentage values. 


Table 8: Suffixes and their frequencies in pure suffixation and in clipping combined with
suffixation (percentage among all suffixed nicknames in the database)


Type                German                  Dutch                   English
                    M       F       All     M       F       All     M       F       All
-i                  51.9%   58.5%   54.8%   45.9%   39.2%   43.2%   50.4%   30.1%   41.2%
-o                  6.9%    0.8%    4.2%    5.4%    -       3.2%    12.2%   6.8%    9.7%
-[ə]                13.1%   3.3%    8.8%    5.4%    2.0%    4.0%    -       -       -
-s                  0.6%    0.8%    0.7%    -       5.9%    2.4%    14.6%   30.1%   21.7%
-ers                -       -       -
Diminutive suffixes 1.9%    16.3%   8.1%    16.2%   21.6%   18.4%   -       -       -


We can derive a number of general tendencies from the distribution: a) The -i suffix 
is dominant across the three languages. b) While G. and D. mainly use syllabic 
suffixes, E. uses the non-syllabic suffix -s with greater frequency. c) The -s suffix 
is marginal in G., and in D. and E. it is much more frequent with female nick- 
names than with male ones. d) Diminutive suffixes are mostly reserved for female 
names in G. but appear more often and with both sexes in D. e) The -o and schwa 
suffixes have a male connotation and are rare in female names. 


23 Meusel might be a diminutive form of Maus ‘mouse’ (Mäusel) in which the umlaut is obscured
in writing. Thanks to an anonymous reviewer for this observation. 
6.5 Compounding


In compounding, the legal name is used in combination with a lexeme to form a 
compound, a strategy which is rather infrequent in all three languages (G.: 0.9%, 
D.: 1.5%, E.: 2.1%). Interestingly, G. compounds are usually formed with the legal
name in final position (Partykarl ‘party’ + Karl, Keilriemen Kock ‘fan belt’ + last 
name Kock, Drogen Marc ‘drugs’ + Marc, etc.), while English displays the reverse 
order (Clare Bear, resembling care bear, Cole World, Zoebug, Danny Boy, Kayla 
Bug, Bower Power, Mika-mouse, resembling Mickey Mouse, etc.). In terms of tradi- 
tional, i.e. standard, determinative compound formation, in G. the name is deter- 
mined by a lexical item (Partykarl ‘Karl is frequently found at parties’), while in 
E. the name determines a class of items (Kayla Bug ‘a bug of the Kayla type’). 
Nicknames are interpreted differently, of course, but the relative position of the 
items in compounds may account for G. allowing a greater variety of words in first 
position, whereas E. mainly has terms for animals and human beings in final 
position. In D., both types are found with similar frequencies (Boemboem Mikey, 
All-in Adam vs. Loesbal “Loes + ‘ball’”, resembling voetbal ‘soccer’, Maaskantje 
“last name Maas + kantje ‘edge’”, Tonygoal). 


6.6 Blending 


Blending is rather marginal in all three languages. While D. has just one blend in 
the entire dataset, 1.5% of all G. and 2.8% of all E. nicknames are blends. While 
there are blends from both parts of the legal name (G. Schmiri < Miriam Schmitz, 
E. Wex < Alex Wendler), most blends consist of a name and a noun (or another 
personal name). The parts derived from the legal name may come first in the 
blend (G. Ankaninchen < Anika + Kaninchen ‘rabbit’; D. Rombocop < Romboud + 
Robocop; E. Mikattack < Mika + attack, Eveready < Everett + (ever) ready, Fulldog 
< Fuller + bulldog) or last (G. Promillhard < Promille ‘ppm (alcohol level)’ + Geb- 
hard, Skandalex < Skandal ‘scandal’ + Alex < Alexander; E. Zeustas < Zeus + Ustas). 
While G. uses both types, E. nickname blends are mostly of the former type.
Regarding the formation of the blends, we use Ronneberger-Sibold’s (2006) 
typology. According to this typology, most blends in the sample are transparent, 
viz. so-called complete blends: both parts are maintained in full and coalesce, 
with the end of the first part overlapping with the beginning of the second (a 
so-called telescope blend), cf. Mika and attack in Mikattack. Additionally, so- 
called contour blends are frequent: in Machinez < Martinez, for instance, the 
trisyllabic form and rhythmic structure of Martinez is used as a matrix word which 
incorporates machine. Shortened forms are sometimes used for such processes, 
cf. Rombocop, which incorporates the shortened form Romb < Romboud into the
matrix word Robocop, or Promillhard, combining the matrix word Promille with 
the shortened form hard < Gebhard. 
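
The overlap that defines a telescope blend can be illustrated with a short Python sketch. This is only an orthographic approximation offered for illustration, not an implementation of Ronneberger-Sibold's typology: the function name `telescope_blend` is hypothetical, and the overlap is taken to be the longest stretch of letters that ends the first part and begins the second, ignoring case.

```python
def telescope_blend(first, second):
    """Merge two words on their longest shared edge, ignoring case.

    The overlap is the longest prefix of `second` that is also a suffix
    of `first`. Returns the blended form, or None if the two words share
    no edge material and therefore cannot form a telescope blend.
    """
    f, s = first.lower(), second.lower()
    for size in range(min(len(f), len(s)), 0, -1):
        if f.endswith(s[:size]):
            # Keep the first word in full and append what is left of the second.
            return first + second[size:]
    return None


print(telescope_blend("Mika", "attack"))   # Mikattack
print(telescope_blend("Skandal", "Alex"))  # Skandalex
print(telescope_blend("Zeus", "Ustas"))    # Zeustas
```

Contour blends such as Rombocop or Promillhard are deliberately outside the scope of this sketch, since they replace part of a matrix word rather than overlapping at an edge.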


6.7 Defamiliarization 


On occasion, names are found in deliberately defamiliarized forms. Defamiliarization
that results in homonymy with an existing noun or name (G. Aldi, otherwise
a supermarket chain < Altmann; D. Kaas, otherwise ‘cheese’ < Keesjan) is not
part of the present study (see the restrictions mentioned in section 4 above). How- 
ever, some kinds of defamiliarization occur without resulting in homonymy with 
an existing name or lexeme. In some cases the outcome is clearly related to the 
base but cannot be assigned to any of the categories introduced above (G. Tiffi < 
Stephanie, Tinna < Tina). Some defamiliarization affects only the written form 
(cf. D. Flen3 < Flendrie, with drie ‘three’), in other cases it affects the pronuncia- 
tion. While many instances of defamiliarization appear to be rather idiosyncratic 
(E. Shewan < Shannon, Dank-sho < Donagan), a degree of systematicity can be 
identified, e.g. in the use of specific variants (G. dialectal Kerschdin < Kerstin) or 
languages: G. Särah < Sarah, Änna < Anna, Däif < David, likely inspired by corresponding
E. first names with a fronted /a/; G. Darco < Marco may be a blend of
E. dark and Marco; D. Rows < Rosan with a graphic indication of the diphthong as
pronounced in English. Curiously, the umlaut <ä> is frequently used in the creation
of G. nicknames even when they are not related to an English form (Päskä <
Pascal, Säb < Sabrina, Jänsch < Janina, Mäthe < Mathias).

In D., there are cases where the onset consonants of the first and the last 
name are switched (Wim Tielns < Tim Wielens, Raaf Doelofs < Dave Roelofs), again 
suggesting the highly creative and playful character of nickname formation. This 
is also evident in D. nicknames swapping vowels (Nocci < Nico) and E. nicknames 
using letters from parts of a name or mere syllables, cf. B-Ry < Bryan, B-ren < 
Brenna, Ba-la-ke < Blake. Defamiliarization and word play are used more frequently 
in G. (9.0%) and D. (10.5%) than in E. (4.4%). 


6.8 Other cases 


There are certain nicknames in all three languages (G. 1.1%, D.: 2.2%, E.: 1.7%) 
that do not correspond to any of the formation types listed above. For example, a 
few nicknames are formed by changing the beginning of the base, somewhat like 
prefixation although no prefixes are involved, occasionally including shortening 
(G. Mulf < Ulf, Spritta < Britta, D. Flaris < Caris, E. Crim < Tim). In D., a number of
last names incorporating a preposition and an article of the van de(r) X type are
reduced to forms lacking these parts, cf. Sluis < van der Sluis, Kreeke < van de
Kreeke.²⁴ Other cases are again rather idiosyncratic and do not show any recognisable
pattern (cf. E. Been < Ben).


7 Word creation in nicknames without a base 


The dataset contains a number of nicknames that do not resemble the correspond- 
ing legal names, but are not homonymous with other lexical or proprial items 
either. They do not seem to have any recognizable base and are therefore likely to 
represent entirely innovative name creations.²⁵ Strikingly, the D. data contain just
one name of this kind (Sjiemelle), whereas E. has nine cases of such freely formed
nicknames (1.9%) (cf. (1) below) and G. 20 (4.4%) (cf. (2) below), with considerably
higher frequencies of male nicknames. 


(1) English nicknames without a base 
a. male: Chegs, Ders, Guence, Joogs, Nandy, Stoff, Udzy, Walzy 
b. female: Patoonch 


(2) German nicknames without a base 
c. male: Babba, Bane, Buggi, Ji-C, Knusterich, Negaaaaa, Nosti, Palo,
Patzi, Piwi, Pötzi, Quixxen, Rasi, Zaern, Zotze 
d. female: Issi, Sagankl, Tudt, Tuff, Vogal 


24 Whether these kinds of names are regarded as nicknames or not is a matter of definition. 
Since names starting in van de (similar to names starting only with a determiner like de or a 
preposition like van) are characterized by a clear pattern with only one variable part, nicknames 
based on the only truly individual part of the name may conceivably be regarded, not as nick- 
names, but rather as simply the relevant part of the last name. Their linguistic status is therefore 
unclear. However, since such names were provided as nicknames in the internet profiles, they 
were included as such in the database. 

25 These items were identified as freely coined, but we cannot be fully certain whether this 
is invariably correct. Nicknames that look as if they had been created from scratch may in fact 
reflect words from varieties or names that were unknown to the team of linguists working on the 
database. Although internet searches were conducted for all words that were not clearly cate- 
gorisable, some uncertainty remains because of the great variety of vernaculars and styles on 
which the coinages may conceivably be based. 
Despite strong idiosyncrasies, some tendencies can be identified, and certain
general characteristics of nickname formation as described above are reflected in 
freely formed nicknames. In G., most such names (15 of 20 names) are disyllabic, 
whereas in E. monosyllabics (5 of 9) outweigh the disyllabics (4 of 9). E. mono- 
syllabic nicknames end in -s in 4 of 5 cases (Ders, Joogs), whereas G. monosyllabic 
nicknames end in closed syllables. In both languages, many disyllabic nicknames 
end in a full vowel. While -i is a popular ending in both G. (7 of 15 disyllabic free 
nicknames, cf. Nosti, Buggi) and E. (3 of 4 names, cf. Nandy, Walzy), -a and schwa 
each occur twice at the end of G. nicknames (Babba, Bane). The number of nick- 
names containing consonant clusters in -s is conspicuous (E. Chegs, Udzy; G. Patzi, 
Pötzi (players from different teams), Quixxen, Zotze). On the other hand, there are 
also nicknames with simple, ambisyllabic consonants (Babba, Buggi, Issi). 

Occasionally, such nicknames resemble existing lexical items, either formally 
(cf. Babba, reminiscent of Papa ‘dad’) or by using word formation elements such as 
diminutive (Sagankl) or motion suffixes (G. Knusterich). This again underscores the 
uncertainty of categorization associated with this type of nickname (cf. footnote 25). 

In sum, while the free formation of nicknames is generally unrestricted, most 
such nicknames conform to the syllabic characteristics as outlined above for 
nicknames based on legal names. Others reflect the general characteristics of 
lexical items. 


8 Results 


Our data show that nicknames are formed in a great variety of ways in the three 
languages under consideration. Given the narrower scope of data that is not 
homonymous with lexical items or existing names, the diversity is even greater 
than it appears here. And yet, there are clear, highly frequent patterns in the data. 
While nearly anything is possible in nickname formation in principle, nickname 
formation does seem to be governed by prototypes. Actual nicknames may instan- 
tiate the prototypes entirely or in part, or an ad-hoc solution may be applied. 

As Köpcke (2002) has suggested for i-shortenings in G., the prototypes are 
organized according to cognitive schemas. Such schemas are defined by bundles of 
phonological-prosodic and semantic characteristics which together circumscribe 
a prototype structure. Nicknames may be coined with reference to such schemas, 
whose prototypical characteristics they instantiate either entirely or to a given 
degree. Two schemas can be identified as particularly relevant in the data: the 
schema of disyllabic nicknames ending in -i (Conni) and the schema of mono- 
syllabic nicknames ending in a fricative (Megs). 
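
The two schemas can be approximated by a rough spelling-based check. The following is only a minimal sketch and not part of the study's method: the function names are hypothetical, syllables are estimated by counting runs of vowel letters, and the lists of endings are simplifying assumptions. Since prosody cannot be read off written forms, stress is ignored altogether.

```python
import re

# Assumption: each run of vowel letters stands for one syllable nucleus.
# This is a crude orthographic heuristic (it miscounts silent letters).
VOWEL_RUN = re.compile(r"[aeiouyäöü]+")


def syllable_estimate(name):
    """Estimate the number of syllables from the spelling alone."""
    return len(VOWEL_RUN.findall(name.lower()))


def schema_of(name):
    """Assign a nickname to one of the two schemas, or to 'other'."""
    form = name.lower()
    syllables = syllable_estimate(form)
    # Schema 1: disyllabic, spelled with final -i, -y or -ie (Conni, Thanny, Passie).
    if syllables == 2 and form.endswith(("i", "y", "ie")):
        return "disyllabic in -i"
    # Schema 2: monosyllabic, spelled with a final fricative letter (Megs, Jev, Daph).
    if syllables == 1 and form.endswith(("s", "z", "sh", "f", "v", "ph")):
        return "monosyllabic in fricative"
    return "other"


for nickname in ["Conni", "Passie", "Thanny", "Megs", "Jev", "Manolo", "Flip"]:
    print(nickname, "->", schema_of(nickname))
```

Such a check only tests conformity of the output form with a schema; it says nothing about the morphological process by which the form was derived.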
In all three languages, nicknames most frequently instantiate disyllabics with 
final -i (cf. Fig. 2). They are all hypocoristic, suggesting a positive, friendly 
relationship between name giver and bearer, and associated with softness and 
tenderness rather than roughness and toughness (cf. Wierzbicka 1992: 378).²⁶ Although 
prosody cannot be read off the names in their written form, earlier studies and 
our experience with nicknames in the three languages suggest that a great 
majority of the disyllabics form trochees, i.e. they are stressed on the first syllable. 
The link between the two syllables usually consists of a single consonant. 


[Figure 2, diagram not reproduced. Prototype (left): disyllabic trochee, open 
syllable, final -i, syllables linked by a single consonant, e.g. Conni, Passie, Thanny. 
Peripheral types (towards the right) diverge from the prototype in one or more 
features: multisyllabic ending in a trochee; closed syllable; "final full vowel" 
and "final vowel" (both +/- tough); syllables linked by a consonant cluster. 
Examples shown in the diagram: Manolo, Erline, Ini-Bini, Boltini, T-Dougie, Alanjo, 
Carooltje, Joha, Jojo, Robbo, Päde, Joene, Basti, Brancii, Mortsy, Tobse, Robser, 
Tibis, Hannek, Darbis, Genkes, Nankers, Mortel, Bettschgo.] 


Fig. 2: The schema of disyllabic nicknames ending in -i 


Many nicknames that refer to this prototype instantiate all its characteristics, 
while others diverge to a certain extent, e.g. in terms of the number of syllables, 
the final sound, or the link between the syllables. In Figure 2, such more periph- 
eral items are placed on the right. To save space, only the diverging features are 
mentioned; any unmentioned characteristics are in line with the prototype. 
Divergences from the prototype can also result in changes in the connotation of 
nicknames. For instance, final -o or schwa²⁷ can produce a ‘toughness’ interpretation 
in addition to hypocorism. At the right-hand side of Figure 2 are also names 


26 Although we avoid using ‘feminine’ and ‘masculine’ in this context, the associations are 
clearly linked to sex role stereotypes. 

27 -o is used as a pejorative suffix in G., for example. Schwa endings have been interpreted as 
augmentative in D. (historically -en with n-apocope), e.g. in Van Langendonck (1978: 6; 1999: 249). 
that diverge from the prototype in various ways at once. The idea behind this 
representation is that nicknames are formed with reference to this specific proto- 
type, with broader schemas (e.g., ending in a full vowel) at hand that allow for the 
formation of nicknames still resembling the main prototype to a certain degree. 

As to why this specific schema has become so popular across the three 
languages, an initial consideration may be that the prototype diverges from the 
structure of “normal” lexical items and therefore produces salient words.²⁸ For 
example, “normal” lexical items in West-Germanic languages are usually mono- 
or disyllabic and either end in a closed syllable or (often in G., more rarely in D.) 
in schwa. Full vowels are infrequent in unstressed syllables because of the weak- 
ening of unstressed vowels that all three languages underwent during the medieval 
period. As shown above, when language users are free from the restrictions of the 
lexicon, words that are very different from traditional lexical items seem attractive. 
On the other hand, the less a nickname resembles the main nickname prototype, 
the more it tends to resemble traditional lexical items (cf. the types Päde, 
Robser). Such items often have a strikingly sex-based distribution, e.g. disyllabics 
ending in schwa in G. male nicknames. Any resemblance to traditional lexical 
items seems to be acceptable when such a connotative effect is to be obtained. 

A second schema that can be observed in the three languages produces mono- 
syllabics ending in a sibilant. This type, which is very productive in D. and E. but 
barely in G., carries a connotation of toughness and roughness in E. according to 
earlier studies (cf. Wierzbicka 1992: 378) and stands in a complementary relation- 
ship with its disyllabic counterpart. Figure 3 again shows the distribution of nick- 
names in accordance with this prototype. Nicknames that diverge from the proto- 
type end in, e.g., a consonant cluster or other types of consonants, with fricatives 
and other obstruents being closest to the prototype. While the disyllabic prototype 
shown in Figure 2 is represented across all three languages, the individual lan- 
guages show different preferences here: D. and G. prefer monosyllabics ending in 
any kind of fricative rather than specifically in a sibilant. The principle of organi- 
zation, however, is the same. 


28 In her comparison of German shortenings with other, more typical lexical structures, Ronne- 
berger-Sibold (1995) similarly identifies a prototype which is maximally dissimilar from the tradi- 
tional lexicon; cf. also Kürschner (2018), where nicknames and product names are added to the 
comparison. Note that disyllabics ending in -i are also far from the usual structures of legal first 
names, at least in G. and D., thus distancing the nicknames from traditional lexical items as well 
as from the legal names on which they are based. 
[Figure 3, diagram not reproduced. Prototype: monosyllabic, closed syllable, 
final sibilant, + hypocoristic, +/- tough (examples shown: Sabs, Rens, Drinks; 
Sash, Bous, Maze). Peripheral types: final consonant cluster (Marv, Jord, Corn); 
final single consonant, subdivided into final fricative (Jev, Daph), final 
obstruent (Flip, Mriek, Clayt) and final consonant (Böm, Vaan, Hil), each +/- tough.] 


Fig. 3: The schema of monosyllabic nicknames ending in a sibilant 


Having introduced the two main prototypes, let us recapitulate how nicknames are 
formed. Firstly, formation seems to be output-oriented rather than input-oriented. 
As found by Köpcke (2002) in his analysis of i-formations in G., the orientation 
towards the output described by the schemas can be achieved by differing mor- 
phological means such as clipping, suffixation, or (particularly useful) combina- 
tions of both. It is not the morphological process itself but rather the conformity 
of the output with the schema that is at the core of nickname formation. This 
accounts for the flexibility of nickname formation. Given the base form Dominik 
in G., likely output forms are Mini (by clipping), Domi (by suffixation) and Niki 
(by clipping and suffixation together), as are Dome, Mino, Nika (all by clipping 
and suffixation) and many others. Alternatively, language users may coin more 
individual, less schema-oriented nicknames using other types of name manipu- 
lation such as compounding or blending. Despite overall orientation towards the 
schemas, the phonetic and morphological freedom in the coining of nicknames is 
considerable. 

In contrast to E. and D., G. makes strikingly little use of the second schema. 
A reason may be this prototype’s relation to sex-role stereotypes, which seems to 
be reflected less in nicknaming in G. than in the other two languages. In her 
comparison of the sound structure of G. nicknames with the sound structure of 
regular first names, Nübling (2014) finds that the typical differences between 
male and female first names (final sound, number of syllables, word accent, rela- 
tive number of vowels/consonants) are only marginally reflected in nicknames. 
Trochees in -i are mainly associated with femininity in first names, but in nick- 
names this structure is as productive with men as it is with women (as indeed 
confirmed by our data). Nübling suggests that the in-group character of nick- 
names makes the phonological marking of sex unnecessary: while first names 
are used when introducing new people and can be used without any knowledge 
of the referent (thus making the marking of sex more necessary), nicknames are 
usually coined among groups of users who know each other well. Nicknames 
therefore do not need to carry sex-related information, whereas more subjective 
information such as the positive attitude suggested by a hypocoristic form, for 
instance, is highly relevant. 

Nevertheless, certain patterns in our data are related to the sexes. Although 
specific patterns are rarely reserved exclusively for either sex, some are clearly 
preferred in male over female nicknames or vice versa. Frequency distributions 
in our data suggest that patterns may indeed evoke at least a vague sense of femi- 
ninity or masculinity. For example, disyllabics in -o have an analogous tendency 
to occur in male names across all three languages. Where specific patterns are 
concerned, however, the apparent sex-orientation of nickname patterns is cross- 
linguistically far from rigid. A final schwa in disyllabics, for instance, often occurs 
in male names in G., but with no gender-related distribution in D. Final -s occurs 
in female monosyllabic nicknames in D., but is more or less sex-neutral in E. 
Monosyllabic nicknames (specifically of the second prototype) tend to be female 
in D. and E., male in G. This runs counter to existing studies of sound symbolism 
and suggests that otherwise “male” structures are used for women’s nicknames 
as a playful way of breaking the mould of sex-based role expectations.²⁹ As we 
saw, structures associated with any particular sex also vary and there is no clear 
association of specific patterns with any one sex. Culture- and language-specific 
patterns do arise, even in such closely related languages as G., D. and E. 

Summarizing the contrastive results from this study, there are a great number 
of parallels between the three Germanic languages, confirming their close linguis- 
tic ties, but also numerous divergences. What do the parallels and divergences 
suggest about the relationship between the three languages? For example, does 
the picture of D. between G. and E., as proposed by van Haeringen (1954) on the 
basis of morphosyntactic properties, hold for the formation of nicknames? Table 9 
lists a few parallels and divergences between all three languages or pairs of 
them. 

The table shows that D. shares some characteristics with E. and some with G. 
The strong use of monosyllabic output is characteristically absent in G., whereas 
E. similarly lacks the phonological characteristic of allowing schwa suffixes and 
the morphological use of diminutive markers that are shared by G. and D. A few 
features are shared between G. and E. but not D., for instance the dominant use 
of monosyllabics ending in final fricatives for male nicknames. Mostly, however, 
features are shared between G. and D. or between D. and E. Based on these obser- 
vations, it is indeed appropriate to say that D. occupies an intermediary position 


29 Since our data stem from sports teams, the fact that most of them are single-sex teams 
might play a role here. 
between G. and E. in terms of the main characteristics of nickname formation. In 
certain respects it clusters with both languages, while G. and E. tend to keep 
more distance from one another. 


Table 9: Parallels and divergences in the formation of German, Dutch, and English nicknames 
(dashed line and empty cell: phenomenon much rarer in the language with the empty cell) 


                             German        Dutch         English 
Frequent Output 1            disyllabic trochees in final -i / full vowel (all three languages) 
Frequent Output 2            -             monosyllabics in final -s / fricative (Dutch and English) 
Endings                      schwa, diminutive suffixes (German and Dutch)    - 
Processes                    defamiliarization and word play (all three languages) 
Acronym formation            seldom        occasional    often 
Monosyllabics in fricative   male          female        male 


In summary, despite the considerable phonetic and morphological freedom 
with which nicknames can be coined in G., E. and D., they seem largely to be 
formed according to just two prototypical schemas. These schemas may serve as 
a starting point for future studies, for instance by including diachronic data to 
examine the historical development of the schemas described in our study. The 
main schema for positively connotated nicknames in our data, viz. the disyllabic 
trochee in -i, is unlike typical lexical items and therefore well suited to making 
nicknames recognizable as such. Various (mostly language-specific) patterns allow 
the nicknames to bear sex-based connotations. 

Between them, the three languages in focus exhibit clear parallels in the 
formation of nicknames but also diverge from each other in various respects. D. 
shares a number of features with E. and others with G., but E. and G. only rarely 
cluster together against D. Based on these observations, it is fair to conclude that 
a Germanic sandwich, with Dutch between German and English, exists in the 
realm of nicknames. 
Analogues of the way-construction in 
German and Dutch: another Germanic 
sandwich? 


Abstract: This paper addresses the English way-construction [SUBJᵢ V POSSᵢ way 
OBL] and its reflexive analogues in German and Dutch. We argue that the different 
constructions are best compared using conceptual terms describing middle situ- 
ations in the domain of autocausative motion (Kemmer 1993). Two dimensions 
are especially important: path traversal and goal-directedness (or telicity). It will 
be shown that way-constructions and their analogues can be arranged along these 
dimensions. Moreover, there is a general parallel tendency for newer construc- 
tions to occupy the domain of ‘path traversal’. In English, this development has 
resulted in the way-construction being dominant at the cost of the historically 
prior reflexive resultative construction. In Dutch, the weg-construction, which 
expresses path-traversal, competes with the more generally established Transition- 
to-Location Construction, which specialises in the expression of telic transition of 
location. In German, finally, there is no schematic Weg-construction: the entire 
conceptual space of autocausative motion is covered by reflexive constructions, 
either instantiations of a more general reflexive construction [SUBJ V sich OBL] 
or inherently reflexive verbs. 


Zusammenfassung: Der vorliegende Beitrag behandelt die englische way-Konstruktion 
[SUBJᵢ V POSSᵢ way OBL] und die äquivalenten Reflexivkonstruktionen 
des Deutschen und des Niederländischen. Es wird gezeigt, dass sich diese unter- 
schiedlichen Konstruktionen am besten mithilfe von Kategorien vergleichen las- 
sen, die für die Beschreibung von Medialsituationen im Bereich der autokausa- 
tiven Bewegung entwickelt wurden (Kemmer 1993). Vor allem zwei Dimensionen 
erweisen sich dabei als wichtig: Zurücklegen eines Pfades und Zielgerichtetheit 
(oder Telizität). Außerdem wird gezeigt, dass die way-Konstruktionen und ihre 
Äquivalente entlang dieser Dimensionen geordnet werden können und dass jün- 
gere Konstruktionen dazu neigen, zunächst die Domäne des ‘Zurücklegens eines 
Pfades’ zu besetzen. Im Englischen hat dies dazu geführt, dass die way-Konstruk- 
tion die historisch ältere reflexive Resultativkonstruktion weitgehend verdrängt 
hat. Im Niederländischen befindet sich die weg-Konstruktion, die das Zurück- 
legen eines Pfades ausdrückt, in Konkurrenz mit einer bereits etablierten Kon- 
struktion, die auf den Ausdruck eines telischen Ortswechsels spezialisiert ist. Im 
Deutschen gibt es schließlich keine schematische Weg-Konstruktion: Hier wird 
die konzeptuelle Domäne der autokausativen Bewegung vollständig von 
Reflexivkonstruktionen abgedeckt, sei es von Instanziierungen der allgemeineren 
Reflexivkonstruktion [SUBJ V sich OBL] oder von inhärent reflexiven Verben. 


1 Introduction 


In this paper, we address the English way-construction as exemplified in (1), one 
of the classic constructions of English described in Goldberg (1995), and its ana- 
logues in Dutch and German, which differ from the English one in at least two 
respects. First, both involve the obligatory presence of the weak reflexive marker 
zich/sich, which is not present in the English construction. And second, although 
Dutch allows the NP een weg, cf. (2), German does not have a productive construction 
with an NP of the type einen/den/seinen/ihren Weg. Instead it makes use of a reflexive 
construction (3) to render the meaning of the English way-construction. 


(1) He swam his way into the final. 
(2) Hij zwom zich (een weg) naar de finale. 
(3) Er schwamm sich ins Finale. 


That German and Dutch use a reflexive marker while English does not reflects a 
more general trend, as it is well known that the use of reflexive markers is more 
constrained in English than in Dutch and particularly in German. Steinbach (2002: 
46ff.), for instance, notes that the English reflexive cannot be used as a middle 
marker at all (on our use of the notion ‘middle’, see section 3.1), whereas both 
Dutch and German allow the weak reflexive zich/sich in so-called anticausative 
constructions, see (4), and only German has reflexive sich in facilitative construc- 
tions of the type illustrated in (5) (for similar observations, see also Oya 2002, 
2003). 


(4) The door slowly opened / De deur opende zich langzaam / Die Tür öffnete 
    sich langsam. 
(5) The book reads easily / Het boek leest (*zich) gemakkelijk / Das Buch liest 
    sich leicht. 


In the same vein, only in German can sich be used with reciprocal meaning, whereas 
both English and Dutch make use of other strategies to express reciprocity. 
(6) They greet each other | Ze begroeten elkaar/mekaar | Sie grüßen sich. 


This seems to suggest the existence of a (kind of) reflexive cline on which English 
occupies the leftmost position and German the rightmost. It is this cline which we will 
try to make more specific in this paper, as we will use it to account for the different 
constructions used in Dutch and German to render the English way-construction. 

The paper is organized as follows. In section 2, we will address the English 
way-construction and its main alternatives in Dutch and German in detail. Sec- 
tion 3 zooms in on the meaning potential of reflexives in English, Dutch and 
German, especially in the domain of autocausative motion (see Kemmer 1993; 
Geniušienė 1987), and tries to link this potential to the constructions discussed 
in section 2. Section 4 presents a short conclusion and outlook. 


2 The way-construction and its equivalents 
in Dutch and German 


English has several means to describe goal-directed motion along a path, and the 
way-construction is one of them. Its formal and semantic properties as well as the 
differences between the way-construction and a less common but semantically 
related reflexive resultative construction will be discussed in the following section. 
Sections 2.2 and 2.3 will address counterparts of the way-construction in Dutch 
and German respectively. 


2.1 The way-construction in English 


The English way-construction is a productive, non-compositional construction with 
idiosyncratic syntactic and semantic properties that were described in Jackendoff 
(1990) and Goldberg (1995, 1996), among many others. It consists of a (typically 
agentive) subject, a verb, and two post-verbal elements: a noun phrase containing 
the noun way following a possessive pronoun that is co-referential with the subject 
on the one hand, and a directional adverbial describing the path that is created 
by the action expressed in the verb on the other. Typical instances of this formal 
pattern are presented in (7a-c): 


(7) a. Frank dug his way out of the prison. 
b. Sam joked his way into the meeting. 
c. The hikers clawed their way to the mountain top. 
These examples illustrate some of the peculiarities of the construction. First, it 
is clear that the NP containing way cannot (any longer) be regarded as a straight- 
forward direct object of the verb since the verb in the way-construction can be 
intransitive, as in (7b) and (7c). Second, although the way-construction always 
denotes motion along a path, it is perfectly compatible with verbs that do not 
express motion at all, as example (7b) demonstrates. The construction can there- 
fore be regarded as a non-compositional “constructional idiom” (Jackendoff 1990: 
221) whose meaning cannot be predicted on the basis of its components, but is 
directly associated with the construction itself. Regarding this meaning, Goldberg 
(1995: 207) argues that the way-construction prototypically involves both creation 
of a path¹ and movement along this path. Since the path is mostly not 
pre-established, but created by the subject referent, the movement along the path is often 
perceived as difficult or hindered by obstacles. Note that there is no consensus in 
the literature with respect to (the classification of) the meanings expressed by the 
way-construction. Goldberg (1995, 1996) distinguishes a more basic (and hence 
much more frequent) means interpretation, in which case the verb denotes the 
means by which the path was created, from a derived (and hence less frequent) 
manner interpretation, with the verb denoting an action that merely accompa- 
nies the motion (without the implication that a path is being created), as in (8): 


(8) They were clanging their way up and down the narrow streets. 
(Goldberg 1995: 209) 


In his paper on the diachrony of the way-construction,² Israel (1996), on the other 
hand, distinguishes between three senses: a means (or path-creation) sense, in 
which the construction expresses the idea that the subject creates a path and 
moves along it, often with some difficulty; a manner sense, in which the verb is 
a motion verb that specifies aspects of the way motion occurs (see example (9), 
taken from Israel 1996: 222); and a so-called incidental activity sense, whereby 
the verb codes “some incidental activity that happens to accompany motion” 
(Israel 1996: 225). This incidental activity often involves the production of a 
particular sound along with the motion, as in (10) (example from Israel 1996: 225).³ 


1 This is consonant with the fact that the most frequent verb in the present-day way-construction 
is make. 

2 Detailed information on the rather complex historical development of the construction can 
also be found in Traugott/Trousdale (2013: 76-91). 

3 Somewhat confusingly, Israel’s incidental activity sense equals Goldberg’s manner sense, 
whereas his manner sense is subsumed in the means category in Goldberg’s account. In the 
remainder of this paper, we will follow Israel’s more precise classification. 
What unifies the three senses is the fact that they all involve motion along a path. 
Note that both the manner and the incidental activity sense have a considerably 
lower token frequency than the more central means sense (see Perek 2018). 


(9) She fumbled her way down the dark stairs. 


(10) He whistled his way to the main front door. 


Before we turn to the Dutch and German analogues of the English way-construction, 
an English ‘competitor’ of the way-construction should be introduced. Mondorf 
(2011) describes the division of labor between the English way-construction and 
an older reflexive construction, which she terms ‘resultative’, as illustrated in (11b). 


(11) a. She worked her way to the top. 
b. She worked herself to the top. 


In fact, Mondorf (2011) argues that the reflexive resultative construction is being 
progressively ousted by the way-construction, given that the way-construction is 
about four times as frequent as the reflexive one in Mondorf’s present-day Eng- 
lish corpus material (ibid.: 405). The reflexive construction still survives in what 
Mondorf describes as ‘abstract’ environments where the nouns in the directional 
NPs refer to abstract entities, as in (12): 


(12) Alex worked himself into a crimson-faced rage. 


Based on an analysis of the competition of both constructions in four time periods 
between 1470 and the present, Mondorf (ibid.: 412) concludes: 


The way-construction consistently has a higher proportion of concrete rather than abstract 
uses throughout all four periods. By contrast, reflexive self scores consistently lower on 
concrete than abstract meanings. This distribution is indicative of a division of labor. Con- 
crete uses are a domain of way, but abstract ones are more closely associated with self. The 
emergent substitution of self by way is more advanced in the concrete domain. In particular 
with abstract meanings, self can still to some extent stand its ground. But even here it is 
continually declining in use. 


Unfortunately, Mondorf (2011) does not provide many examples for the abstract 
uses. A Google search⁴ for combinations of work (one of the verbs in Mondorf’s 


4 Google search “works herself into”: 170 hits, first 40; Google search “works her way into”: 
180 hits, first 40; conducted on February 4th, 2018. 
survey) with either (her)self or way in present-day English helps refine the 
characterization offered by Mondorf. The search reveals that what seems to be at stake 
is the difference between (concrete or abstract) motion along a path (in the 
way-construction) and subject-internal change (in the reflexive construction)⁵ rather 
than the opposition between concrete and abstract entities in the directional NP. 


Table 1: works her way into vs. works herself into: Google-search 


works her way into: 
  [motion towards concrete location]: her father’s library, a prison cell, the back 
    row, a home, the nest 
  [motion towards more abstract location]: a review, a show, the picture, 
    Criminal Girls 2 
  [motion in rankings, sport contexts]: WNBA, Bistaff, the lead in Mumbai, starting 
    line-up, more lineups, varsity lineup, bigger role for Buffs, into the final round 
  [emotional, social upward movement]: the hearts of her adoptive parents, his heart, 
    the Allanson household; her neighbor’s graces; the industrial segment of town, 
    high society, Karl’s circle, New York’s upper echelon, prominence, increasingly 
    powerful positions 
  [internal change]: relaxation; an elite swimmer 
  [other]: an army marching chant, the piano’s low notes 

works herself into: 
  [internal change]: the ground, a state, a frenzy, an orgasm, a minor frenzy, 
    a (fine) lather, a frenzy of grief, enthusiasm, a religious ecstasy, a light daze, 
    a fit, a state of enthusiasm, a frenzied state, some blisters, a tantrum, 
    a panicked frenzy, exhaustion, a panic, a tizzy thinking 


The reflexive resultative construction in present-day English therefore can be said 
to denote resultative, internal changes of state of the grammatical subject, whereas 
the way-construction is primarily associated with motion (i.e. traversal along a 
path), be it concrete or abstract. We will return to this observation in section 3. 


5 Both constructions are also compared in Christie (2011). As one of the differences between 
them, Christie discusses the inability of the prepositional phrase in the reflexive resultative con- 
struction to denote a path. 
2.2 Dutch analogues of the way-construction 


The Dutch analogues of the way-construction have been addressed by Verhagen 
(2002, 2003), van Egmond (2006) and Poß (2010). The main findings can be sum- 
marized as follows. First, contrary to what is claimed in Goldberg (1995), there is 
a formally similar Dutch analogue containing the NP een weg ‘a way’, as in (13). 


(13) Ze baant zich een weg door de menigte. 
‘She makes her way through the crowd’ 


Still, this immediate analogue differs from the English way-construction in two 
respects: it contains a weak reflexive pronoun zich in indirect object position 
(instead of a possessive pronoun within the weg-NP as is the case in English) and 
it is much more strongly tied to the verb banen than its English counterpart, i.e. the 
type frequency of the verb slot seems to be considerably higher for the English 
way-construction than for the Dutch weg-construction (compare Verhagen 2002, 
2003 with Perek 2018). Let us go into these two differences in more detail. 

Verhagen (2002, 2003) shows that the Dutch weg-construction has its origins 
in a ditransitive construction with the verb banen ‘to flatten, to level’, an NP 
containing the noun weg (either definite or indefinite) in direct object position, and a 
benefactive indirect object, which could, but did not need to be co-referential 
with the subject, see (14). 


(14) Koomt  gy   my een weg tot grooter droefheid baanen? 
     cometh thou me a   way to  greater sorrow    pave 
     ‘Are you coming to pave me a way to greater sorrow?’ 
(Verhagen 2002: 423) 


According to Verhagen (2002), parts of this construction became entrenched over 
time, resulting in the formation of the highly specific schema zich een weg banen 
door ‘pave oneself a way through’, in which both the indefinite NP een weg ‘a way’ 
and the preposition door ‘through’ are fixed elements. However, some generaliza- 
tion takes place as well, resulting in the storage of the higher order schema [V zich 
een weg OBL], in which other verbs and other prepositional phrases can occupy the 
verb and the directional slot respectively. So the Dutch weg-construction became 
compatible, not only with banen and semantically similar verbs of path creation, 
but also with more abstract verbs denoting the means of creating a (metaphorical) 
path such as bluffen ‘bluff’, vechten ‘fight’, vreten ‘eat’, zingen ‘sing’, etc. 

It is important to emphasize that in combination with verbs like bluffen ‘bluff’ 
or zingen ‘sing’, the reflexive pronoun can no longer be interpreted as benefactive. 
This is shown by the fact that a periphrasis by means of a voor-phrase seems to be 
impossible, whereas this is still, albeit only marginally, acceptable with zich banen, 
as in (15). Poß (2010) also stresses the strong preference for the weak reflexive zich 
in the Dutch construction (instead of strong zichzelf, cf. (16)), a clear indication of 
the fact that the reflexive “is not a semantic argument of this construction, but 
merely a formal position that needs to be filled” (Poß 2010: 91). 


(15) God baant voor zichzelf een weg in onze geschiedenis.
     God makes for himself a way into our history
     (http://opvoedkundelav.khleuven.be/DIDACTIEK/2REFERENTIEKADER/OPVPROJ/VSKOopvproject.htm)
     ‘God makes for himself a way into our history’


(16) Ze zong (*voor zichzelf) / zich een weg naar de top.
     She sang *for herself / herself a way to the top
‘She sang herself to the top’ 


In present-day Dutch, the preferred and most frequent verb in the construction is 
banen (Verhagen 2002: 412). This holds for both Netherlandic and Belgian Dutch, 
as Table 2 shows.⁶ We can also infer from Table 2 some of the characteristics of the
verbs: their subjects are agents performing an intended action. (Manner of) motion 
verbs are also allowed (e.g. dansen ‘dance’, kronkelen ‘twist’, wurmen ‘wriggle’), 
as are verbs that clearly do not express motion (e.g. zingen ‘sing’, vreten ‘eat’). 


Table 2: The verb slot in the Dutch weg-construction


[zich een weg V OBL]                               Volkskrant (NL)   CONDIV (B)
                                                                     (newspapers)
                                                   n: 43             n: 24
banen ‘make’                                       23                13
vechten ‘fight’                                    5                 1
snijden ‘cut’                                      3                 7
bluffen ‘bluff’                                    2                 0
vreten ‘eat’                                       2                 1
wurmen ‘wriggle’, boren ‘drill’                    1                 1
beitelen ‘chisel’, graven ‘dig’, knagen ‘gnaw’,    1                 0
kronkelen ‘twist’, ploegen ‘plough’, slaan ‘hit’
bijten ‘bite’, dansen ‘dance’, zingen ‘sing’,      0                 1
kreunen ‘moan’, ploeteren ‘to plug’, wroeten
‘root’, roefelen ‘poke, browse’


6 The Netherlandic Dutch data from the second column (Volkskrant) are from Verhagen (2002:
412), the Belgian Dutch data are based on occurrences of the weg-construction in the Belgian
Dutch newspaper part of the written CONDIV-corpus, searched using AntConc. The search string
was zich een weg; irrelevant examples were removed manually.


The other Dutch analogue of the English way-construction is the so-called Tran- 
sition-To-Location Construction (in the following TLC), which contains a weak 
reflexive pronoun and a directional phrase, but no weg-NP. 


(17) a. Ze zingt zich naar de top.
        ‘She sings her way to the top’
     b. De kankercel vreet zich door het lichaam.
        ‘The cancer cell eats its way through the body’ (Poß 2010: 98)


Examples (17a-b) show that the TLC, while denoting some kind of motion, does
not itself have to contain a motion verb. In fact, the verbs occurring in the TLC are
similar to those in the weg-construction.⁷ They denote volitional action on the
part of an agentive subject. Note that the verb banen is possible in the TLC as well,
as shown in (18a-b), although banen occurs considerably more often in the
weg-construction than in the TLC.


(18) a. Iedereen springt snel in de voertuigen en baant zich met loeiende
        sirenes door het drukke verkeer.
        ‘Everyone quickly jumps into the vehicles and, with roaring sirens,
        works his way through the busy traffic’
        (www.bol.com/nl/serie/playmobil-brandweer/9200000045845976,
        last accessed: 17-7-2019)
     b. De rivier baant zich door de bourgondische bossen van het natuurpark.
        ‘The river runs through the Burgundian forests of the natural park’
        (www.raftingadventure.be/raften-morvan-bourgogne, last accessed:
        17-7-2019)


7 Van Egmond (2006: 91) notes that 86% of the verbs she found in her corpus of TLCs also occur
in the weg-construction.


Both van Egmond (2006) and Poß (2010) compare the Dutch weg-construction
with the TLC. Whereas van Egmond (2006) highlights the differences, Poß (2010) 
seems to focus more on their similarities. For van Egmond, the main difference 
between the Dutch weg-construction and the TLC is that the former describes the 
incremental traversal of a path by means of or during the action described in 
the verb, whereas the TLC denotes the transition to a location⁸ by means of the
action denoted in the verb, but without motion along a path. The TLC is therefore 
obligatorily telic (i.e. the endpoint is reached), whereas the weg-construction, 
which focuses more on the path-traversal, is not specified for telicity (van Egmond 
2006: 104). Another difference pertains to temporal structure. Both constructions 
evoke two temporal sub-events, i.e. the actual or metaphorical motion on the one 
hand, and the means of motion, as described by the verb, on the other hand. With 
the weg-construction, the sub-events occur simultaneously, whereas the TLC 
allows interpretations in which the action denoted by the main verb and the 
motion, i.e. the transition to a particular location, are consecutive (van Egmond 
2006: 102). A third difference is that only the weg-construction expresses incremen- 
tal progress along a path. The TLC, by contrast, does not evoke an incremental 
reading: the location in the directional phrase can be reached by a single instance 
of the action described in the verb (van Egmond 2006: 103). 

It is clear that the expression of a path, by means of the NP een weg, has a
considerable impact on the meaning of the construction in van Egmond’s account. 
Poß (2010), on the other hand, plays down the semantic contribution of the NP
een weg: “the weg-element does not play a role in the [...] semantic structure of
the construction [and therefore] the entire ‘creating and traversing a path’-part
can be skipped” (ibid.: 99). For Poß, both constructions mainly differ with regard
to their aspectual properties: whereas the weg-construction evokes either an
iterative reading (with verbs denoting punctual events, e.g. springen ‘jump’) or a
durative one, it is not specified for telicity. The TLC, on the other hand, is
necessarily telic, but not specified with respect to either iterativity or durativity.
When a telic reading is evoked by the directional phrase, e.g. naar de finale ‘into the
finals’ in (19), and the situation denoted is not punctual, the actual difference
between the two constructions can be quite small.


(19) Ze danste zich (een weg) naar de finale.
     ‘She danced her way into the finals’


8 Van Egmond (2006: 114f.) argues that the TLC cannot be equated with the so-called (fake)
reflexive resultative construction, in spite of their strong formal resemblance or even formal
identity (e.g. Ze schreeuwde zich hees ‘She yelled herself hoarse’, ze zong zich in trance/aan
flarden/te pletter ‘She sang herself into a trance/to pieces’, ze zoop zich in coma ‘she boozed
herself into a coma’). According to her, the reflexive resultative denotes a transition to a state that
does not exist independently of the subject, whereas the location the subject moves to in the
TLC exists without the subject. This is strongly reminiscent of the English reflexive resultative
construction, which, as we argued in section 2.1, denotes a subject-internal change of state.


Still, there is a functional difference between the two constructions, and this is
reflected in the frequency with which they code particular constellations. The
inherent telicity of the TLC accounts for the fact that the construction is preferably
used to express situations that are strongly goal-directed (e.g. in combination with
the directional preposition naar ‘to’), whereas the weg-construction more commonly
denotes situations in which the subject (metaphorically) moves along a path
(e.g. in combination with the path-denoting prepositions door ‘through’ or over
‘over’). Table 3 presents the results of several Google searches involving either more
strongly goal-directed (telic) or path-oriented constellations with the same verbs.


Table 3: Different preferences for weg and the TLC⁹


V REFL                               + GOAL                    + PATH
                                     naar ‘to’                 door ‘through’, over ‘over’
danst/dansen/danste/dansten zich     weg: 99 vs. TLC: 143      weg: 92 vs. TLC: 61 (door)
‘dances/dance/danced REFL’
bokst/boksen/bokste/boksten zich     weg: 21 vs. TLC: 76       weg: 10 vs. TLC: 36 (door)
‘boxes/box/boxed REFL’
zwemt/zwemmen/zwom/zwommen zich      weg: 10 vs. TLC: 147      weg: 9 vs. TLC: 6 (door)
‘swims/swim/swam REFL’
zingt/zingen/zong/zongen zich        weg: 62 vs. TLC: 78       weg: 68 vs. TLC: 61 (door)
‘sings/sing/sang REFL’
vecht/vechten/vochten zich           weg: 197 vs. TLC: 265     weg: 24 vs. TLC: 15 (over)
‘fights/fight/fought REFL’


9 Google search, conducted May 25th, 2018.


In each case, the goal-directed constellation favors the TLC, whereas the path-
oriented constellation occurs more frequently in the weg-construction.¹⁰ These
findings also account for the fact that Dutch speakers sometimes prefer the TLC
in cases where English uses a way-construction. A nice illustration is provided by
the following small-scale contrastive study based on Nicci French’s novel Sunday
Morning Coming Down (translated into Dutch as Zondagochtend breekt aan).¹¹
The English original contains 19 instances of the way-construction, five of which
are translated by means of a weg-construction in Dutch (20a). Additionally, two
instances are translated using a TLC (20b).¹²


(20) a. Josef munched his way through the pile of food in front of him,
        occasionally wiping his hand across his mouth. (French 2017a, 61, 10)
        Josef kauwde zijn weg door de berg eten voor zich en veegde zo nu en
        dan met zijn hand over zijn mond. (French 2017b, 61, 8)
     b. Now he can’t stop imagining a worm softly winding its way down the ear
        and into his head. (French 2017a, 35-36, 1)
        Nu kon hij de gedachte dat er een worm in zijn oor zit en zich langzaam
        naar binnen wringt niet uit zijn hoofd zetten. (French 2017b, 35-36, 1)


Fully in line with the analysis presented here, four out of five weg-constructions 
in the Dutch translation contain the preposition door, whereas both TLCs contain 
goal-directed naar. 
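
The frequency argument based on Table 3 can also be made explicit by converting the raw hit counts into proportions. The following Python sketch is only an illustration and not part of the original study: the names TABLE_3, tlc_share and summarize are hypothetical, and the assignment of each pair of counts to the GOAL or PATH column reflects our reading of the table layout.

```python
# Hit counts transcribed from Table 3 (Google searches, May 2018).
# Assumption: for every verb, the first pair of counts belongs to the
# GOAL column (naar) and the second pair to the PATH column (door/over).
TABLE_3 = {
    "dansen":  {"goal": {"weg": 99,  "tlc": 143}, "path": {"weg": 92, "tlc": 61}},
    "boksen":  {"goal": {"weg": 21,  "tlc": 76},  "path": {"weg": 10, "tlc": 36}},
    "zwemmen": {"goal": {"weg": 10,  "tlc": 147}, "path": {"weg": 9,  "tlc": 6}},
    "zingen":  {"goal": {"weg": 62,  "tlc": 78},  "path": {"weg": 68, "tlc": 61}},
    "vechten": {"goal": {"weg": 197, "tlc": 265}, "path": {"weg": 24, "tlc": 15}},
}


def tlc_share(counts):
    """Proportion of TLC hits among all hits (weg + TLC) in one table cell."""
    return counts["tlc"] / (counts["weg"] + counts["tlc"])


def summarize(table):
    """Map each verb to (TLC share with GOAL, TLC share with PATH)."""
    return {
        verb: (tlc_share(cells["goal"]), tlc_share(cells["path"]))
        for verb, cells in table.items()
    }


if __name__ == "__main__":
    for verb, (goal, path) in summarize(TABLE_3).items():
        print(f"{verb:8s} TLC share: goal {goal:.2f}, path {path:.2f}")
```

Under these assumptions, the share of the TLC is higher with goal-directed naar than with the path-denoting preposition for every verb, although for boksen the two shares are virtually identical. Boksen is also the only verb for which the TLC outnumbers the weg-construction in the path-oriented constellation.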

A third alternative for rendering English way-constructions into Dutch is a
simple intransitive motion verb, as in the following three randomly selected
examples from various sources. As this option (which to our knowledge has not
been discussed in the literature so far) is quite straightforward, we will not
discuss it further in the remainder of this paper.


(21) a. I’m sorry, Mr Feltz, they just pushed their way in (Fargo 3, episode 2, The
        principle of restricted choice)
        Het spijt me, Mr Feltz, ze banjerden gewoon binnen [Dutch subtitles]
     b. The actual work starts once the sun has set. He then forces his way into
        the locus delicti of his choice, armed with tripod and camera.
        Het eigenlijke werk begint als de zon onder is. Dan dringt hij gewapend
        met statief en camera binnen bij de locus deliciti [sic!] van zijn keuze.
        (https://www.hanswilschut.com/en/blog/2012/12/07/the-miami-series-hans-wilschut-edo-dijksterhuis/,
        last accessed 23-1-2018 [English and Dutch version of the website])
     c. It took longer for Karlsson to edge his way out of the car and raise himself
        on to his crutches. (French 2017a, 1, 8)
        Bij Karlsson duurde het langer voordat hij was uitgestapt en zich op zijn
        krukken had gehesen. (French 2017b, 1, 7)


10 The only exception is the preference for boksen ‘to box’ to occur in combination with the
path-denoting preposition door in the TLC. A closer look at the hits, however, reveals that many
TLC-examples involve the telic particle heen, which emphasizes the endpoint of the path and
hence induces a telic interpretation, as in [hij] bokste zich door een burn-out heen ‘he overcame a
burn-out’ (lit. ‘he boxed REFL all the way through a burn-out’).

11 Electronic versions of both the English original (French 2017a) and its Dutch translation
(French 2017b) were used.

12 In the remaining cases, other construction types occurred in Dutch, the majority of which
involved the use of intransitive verbs.


To summarize, Dutch features three constructions rendering English way-con- 
structions: (i) a weg-construction, which resembles English way in typically evok- 
ing the notion of moving along a path, (ii) the so-called Transition-To-Location 
Construction, which is reflexive in form but does not contain a weg-NP and gets 
a telic interpretation, and (iii) an intransitive construction with a (mostly non- 
reflexive) movement verb. 


2.3 German analogues of the way-construction 


At first sight, German seems to have only one straightforward option to render the 
English way-construction, i.e. by means of a reflexive construction containing a 
(normally non-reflexive) verb, a weak reflexive pronoun sich and a directional 
phrase, e.g. durch die schmale Straße in (22a), ins Finale in (22b), and zum Sieg 
in (22c). 


(22) a. Ein riesiger Bagger gräbt sich durch die schmale Straße.
        ‘A giant excavator digs its way through the narrow street’
     b. Sie schlägt sich ins Finale.
        ‘She hits her way into the final’
     c. Er stöhnt sich zum Sieg.
        ‘He moans his way into victory’


As argued by Smirnova/Mortelmans (subm.), there are no compelling reasons to
assume the existence of a (schematic, non-compositional) Weg-construction in
German. A corpus study (ibid.) reveals that instances of the pattern with the noun
Weg are indeed very rare and occur almost exclusively with the verb bahnen ‘to
pave’. Most importantly, the pattern can be given a compositional interpretation,
with the reflexive pronoun expressing a benefactive indirect object and the Weg-NP
a straightforward direct object, as e.g. in (23):


(23) Ich bahne mir meinen Weg durch Smog und Baustellenlärm ... (taz, 17-9-2009) 
‘I pave (me.DAT) my way through the smog and the noise of the construc- 
tion site ...’ 


In contrast to the Dutch weg-construction, there is still considerable variation with 
respect to the form of the Weg-NP: as shown in Table 4, we find combinations 
with the definite article, with the indefinite article and with possessive pronouns. 
The combination of sich with a possessive pronoun (seinen/ihren) seems to be 
the preferred one, hinting at a certain degree of entrenchment and conventional- 
ization. This combination of a reflexive pronoun with a co-referential possessive 
pronoun is odd in present-day German (compare Ich wasche mir die Hände with
?Ich wasche mir meine Hände). This suggests that the reflexive pronoun is losing
some of its purely reflexive meaning in the construction. 


Table 4: Weg-modifiers in the German Weg-construction


DeReKo            sich seinen/ihren Weg       sich den Weg      sich einen Weg
(W-öffentlich)    ‘REFL his/her way’          ‘REFL the way’    ‘REFL a way’
bahnt (form)      seinen: 356; ihren: 181     237               151
bahnen (lemma)    seinen: 531; ihren: 686     530               346


As bahnen is the only verb occurring in this construction, the construction may 
be considered a fixed idiomatic expression with a high degree of entrenchment 
and lexicalization. This is the fundamental difference between the German Weg- 
construction and its Dutch and English counterparts: whereas the latter display a 
medium to high degree of schematicity and productivity and allow a variety of 
verbs in the verbal slot, the former is restricted to one verb only. 

The German reflexive construction as exemplified in (22) above, on the other
hand, may be regarded as a non-compositional, schematic, and productive
construction in present-day German (cf. Smirnova/Mortelmans subm.). The verb slot
may be filled with verbs from different semantic groups and syntactic classes, for
example transitive verbs such as graben ‘dig’ in (22a), schlagen ‘hit’ in (22b) and
lesen ‘read’ in (24a), as well as intransitive verbs such as stöhnen ‘groan’ in (22c)
and spielen ‘play’ in (24b). In each case, the construction describes the traversal
of a (metaphorical) path by means of or during the action described by the verb,
although the verbs do not express motion.


(24) a. Ich habe mich jetzt durch den kompletten Thread gelesen ... (DECOW14)
        ‘Now I have read my way through the complete thread’
     b. ... Musik, die sich da luftig und leicht in die internationalen Top 10 spielte.
        (DECOW14)
        ‘music which played its way into the international Top 10 lightly and
        softly’


The German reflexive construction may receive the same two interpretations that 
are typical of its English and Dutch counterparts, i.e. telic (22b, 22c; 24a, 24b) or 
unspecified with respect to telicity (22a). These readings are mostly dependent on 
the larger context of the utterance and on the preposition used in the directional 
phrase. The corpus study in Smirnova/Mortelmans (subm.) reveals that preposi- 
tional phrases with durch ‘through’ usually favor the atelic interpretation, whereas 
other prepositions such as in ‘in, into’, zu ‘to’ etc. are more compatible with the 
telic interpretation, implying that the goal has been reached. The reflexive con- 
struction in German may therefore be regarded as a counterpart to the English 
way-construction as well as to both Dutch constructions, i.e. the weg-construc- 
tion and the TLC. 

It must be noted, however, that the German reflexive construction is often 
difficult to delineate from reflexive patterns in which a verb is accompanied by a 
weak reflexive and a directional phrase, cf. e.g. (25). Some of these verbs obliga- 
torily combine with a weak reflexive (see below) and a directional phrase, making 
it difficult to draw a line between a non-compositional reflexive construction as a 
counterpart of the English way-construction on the one hand, and a lexical reflex- 
ive verb with a directional phrase in its subcategorization frame on the other 
hand (see Smirnova/Mortelmans subm. for detailed discussion). 


(25) a. Er stürzte sich aus Verzweiflung in das Ägäische Meer. (DWDS-Kern- 
korpus, schwanit1999) 
‘In despair, he plunged into the Aegean Sea’ 
b. Aida, die sich in die Gruft geschlichen hatte ... (DWDS-Kernkorpus, 
oper1998) 
‘Aida, who slipped her way into the tomb’ 


In section 2.1, we noted that manner of motion verbs frequently appear in the
English way-construction. As typical representatives of this category, Perek (2018)
notes the following verbs, for which the Oxford Duden German dictionary provides
reflexive translations: edge (German sich schieben), thread (German sich
schlängeln), trail (German schleifen, sich hinziehen), weave (German sich schlängeln)
and wind (German sich winden, sich schlängeln). In general, German has a much
higher number of lexicalized or inherently reflexive verbs, often associated with
motion, e.g. sich begeben ‘go’, sich verkriechen ‘sneak away’, sich verkrümeln
‘sneak off’, sich trauen ‘dare to go’, sich wagen ‘dare to go’, etc. And indeed, as
already pointed out in Pedersen (2013), the English way-construction is mostly
translated into German using an inherently reflexive verb rather than a reflexive
construction. This is demonstrated by Pedersen (2013) on the basis of a parallel
corpus consisting of original English texts and their translations into Spanish,
German and French. (26) shows examples from English and their German
translations (ibid.: 245-246):


(26) a. E he barged his way past Hillary Clinton
        G konnte er sich an Hillary Clinton vorbeidrängeln
     b. E if you try to bob and weave your way ... towards an end game
        G wenn man versucht, sich ... auf ein Endspiel zuzuschlängeln
     c. E we may be able to manage our way through it
        G sind wir vielleicht in der Lage, uns hindurchzumanövrieren


Above, we argued that the German reflexive construction [V sich OBL] may be seen 
as a counterpart of the English way-construction. In fact, the situation appears to 
be more complicated than that: in view of the high number of German inherently 
reflexive verbs, together with the fact that the English way-construction is often 
translated into German using such an inherently reflexive verb, as illustrated in 
(26), we can conclude that there are two different construction types in German to 
render the English way-construction. On the one hand, there is the schematic and 
relatively productive reflexive construction as illustrated in (22) and (24) above, 
which accommodates non-reflexive verbs in its verb slot. On the other hand, there 
are numerous lexical verbs which obligatorily feature a weak reflexive, as exem- 
plified in (25)-(26). As both types of constructions, i.e. the schematic one and the 
substantial lexical ones, resemble each other in form and semantics, the bound- 
ary between them is not always clear. 

In the next section, we will look more closely into the semantics of
autocausative motion, a conceptual domain in which the English, Dutch, and German
constructions described above seem to occupy different regions.


3 Autocausative motion


3.1 Middle situations 


In this section, we will focus on the conceptual semantics of middle situations, 
using mainly the typological distinctions introduced by Kemmer (1993). Under 
such an onomasiological approach, well-known from typological research on 
semantic maps, functionally equivalent constructions from different languages 
are compared as to their positions in the same semantic space. The English way- 
construction and the analogous reflexive constructions in Dutch and German 
encode middle situations, i.e. situations in which the initiator of the event (i.e. 
the subject) is also its endpoint (ibid.: 337). In contrast to straightforward agent- 
patient situations, the patient is therefore co-referential with the actor, or in the 
words of Maldonado (2009: 70), “the subject’s action cannot be distinguished from 
the object’s affectedness” and the event “remains in one participant” (ibid.). The 
way-construction and its analogues specifically encode middle situations within 
the conceptual sub-domain of motion events such that the mover (the grammati- 
cal subject) is also the participant that undergoes motion, i.e. is affected by it. 

In the semantic map of middle situations proposed by Kemmer (1993: 202) it 
is the dimension of autocausative motion that is particularly important for our 
purposes, cf. Table 5: 


Table 5: Reflexive–middle cline (Kemmer 1993: 224)


REFLEXIVE     NON-TRANSLATIONAL          CHANGE IN BODY        TRANSLATIONAL
SITUATION     MOTION                     POSTURE               MOTION
              nod, bow, turn, stretch    sit down, stand up    go, move, climb
two-participant events                                         one-participant events


The cline in Table 5 represents the conceptual continuum of autocausative situa- 
tions. In an autocausative situation a referent (usually human or animate) per- 
forms an action and undergoes a change of state at the same time. Autocausative 
situations are often motion events, and Table 5 shows a continuum of motion 
events in the domain of autocausative situations. 

The continuum covers the semantic space between reflexive situations on the 
left, with two distinguishable but co-referential participants, and translational 
motion on the right, representing ordinary intransitive situations like go, move, 
climb, fly etc., in which only one participant is involved. The domain of non-trans- 


64 —— Tanja Mortelmans/Elena Smirnova 


lational motion with events like nodding, bowing, turning, stretching is located 
near the reflexive pole; in these events, the active participant, e.g. a person, may 
still be distinguished from the participant on which the action is performed, 
which is usually the participant’s own body or parts of it. Further towards the 
one-participant end of the cline, there is the domain of body posture change with 
situations like sit down, stand up, lie down etc. in which it is even more difficult
to distinguish between active and affected participants. 

From a typological point of view, languages differ in how they employ differ- 
ent markers along this continuum. The right part of the table is likely to be coded 
by simple intransitive verbs, the left part of the continuum is likely to be coded by 
transitive verbs with reflexive morphemes signaling the co-reference of the par- 
ticipant roles. 

In the next two sections, we will look at reflexive markers and the analogues 
of the way-construction in Dutch and German with respect to how they cover the 
semantic space represented in Table 5 above. 


3.2 Weak reflexives in the autocausative domain 


It is well known that many languages distinguish between strong and weak forms 
of the reflexive morpheme (cf. Kemmer 1993; Steinbach 2002). Dutch has both, 
German has only one form which serves both functions, and English has only the 
strong form: 


(27)      weak form    strong form
     E    -            -self
     G    sich         sich
     D    zich         zichzelf


In Dutch, the strong forms are used in typical reflexive situations, see (28), whereas 
the weak forms occur with lexicalized inherently reflexive verbs, see (29a), with 
verbs of grooming (29b) and in some middle situations (see below). 

(28)    D ik hoor mezelf    G ich höre mich
        E I hear myself


(29) a. D zich schamen      G sich schämen
        E be ashamed
     b. D zich kammen       G sich kämmen
        E comb

Analogues of the way-construction in German and Dutch — 65 


Unlike Dutch, German is a one-form language, as it does not distinguish between 
weak and strong reflexive forms (cf. Steinbach 2002: 47). In German, the reflexive 
pronoun sich is thus used not only in typical reflexive situations as in (28), but also 
with inherently reflexive verbs in (29). English makes use of a morphologically 
strong reflexive (herself etc.) that is not in opposition with a weak one. Generally, 
the use of the strong reflexive form in English is restricted to typical reflexive situa- 
tions, see (28) (but see Siemund 2010, 2014 for exceptions). 

The West-Germanic languages differ with respect to the middle properties of 
the reflexive markers. For instance, the middle properties of German sich are 
considerably more pronounced than those of its English and Dutch counterparts, 
especially in the domain of motion events. With respect to the continuum of the 
middle situations introduced in the previous section (see Table 5), the German 
reflexive marker sich is used in all sub-domains, cf. (30). The verbs given in (30) 
are lexicalized reflexive verbs; the weak reflexive pronoun is obligatory and can- 
not be omitted (without a considerable change in meaning). 


(30) NON-TRANSLATIONAL MOTION 
sich bücken, sich (ver)beugen, sich (um)drehen, sich (aus)strecken


CHANGE IN BODY POSTURE 
sich (hin)setzen, sich (hin)legen, sich erheben, sich anlehnen 


TRANSLATIONAL MOTION 
sich bewegen, sich begeben, sich entfernen, sich nähern


In Dutch, the situation is more complicated, cf. (31). The weak reflexive zich occurs 
in all sub-domains of autocausative motion, but its use is by no means obligatory 
and subject to variation. Verbs denoting non-translational motion like bukken 
‘stoop’ or buigen ‘bend’ for example mostly occur with a reflexive marker, but 
non-reflexive uses are also found, depending on the context. In the subdomain of 
translational motion, on the other hand, non-reflexive intransitive uses seem to 
be dominant, but reflexive uses are not completely excluded (zich bewegen, zich 
begeven). 


(31) NON-TRANSLATIONAL MOTION
     De aarde draait rond de zon.            Hij draait zich om in zijn graf.
     ‘The earth revolves around the sun’     ‘He turns over in his grave’
     Ik kan niet meer bukken                 Ze bukte zich om de post op te rapen.
     ‘I cannot bend anymore’                 ‘She bent down to pick up the post’
     Hij buigt voor de koning.               Hij buigt zich naar voren.
     ‘He is bowing to the king’              ‘He is bowing forward’

     CHANGE IN BODY POSTURE
     gaan zitten, gaan liggen                zich neerzetten, zich neerleggen
     ‘to sit down, to lie down’              ‘to sit down, to lay down’

     TRANSLATIONAL MOTION
     Hij kon niet meer bewegen               De danser bewoog zich over het podium.
     ‘He couldn’t move anymore’              ‘The dancer moved across the stage’
                                             Hij begeeft zich naar huis.
                                             ‘He goes home’


Duinhoven (2001: 109) argues that the presence of the reflexive marker is associ- 
ated with intentionality: if the subject of the sentence is conceptualized as an 
intentional actor (especially in the domain of translational motion), the reflexive 
is used. According to Oya (2003: 220), the reflexive signals that the body is some- 
how affected by the action denoted by the verb. We would like to propose an addi- 
tional semantic component that accounts for the presence of the reflexive in the 
domain of autocausative motion, namely the goal-directedness of the coded situ- 
ation. If a reflexive marker is used in situations of non-translational motion, i.e. in 
situations which do not necessarily involve the movement of the subject towards a 
fixed endpoint (e.g. draaien ‘turn’), the situation is interpreted as directed towards 
the goal or the endpoint of the motion. Instances of the lemma draaien ‘turn’ 
accompanied by the reflexive pronoun zich¹³ in the Corpus of Spoken Dutch
(CGN¹⁴) either involve the prefix om (zich omdraaien ‘to turn around’), denoting
a fully completed turn, or a directional prepositional phrase (e.g. naar mij toe 
‘towards me’, op zijn buik ‘onto his belly’). Intransitive instances without zich, 
on the other hand, are perfectly compatible with situations that do not evoke an 
endpoint, as in the following examples, again taken from the CGN. 


(32) a. ja ja oh ja ja nou ja dat zijn auto’s die draaien
        ‘yes yes oh yes yes well yes that are cars that are running’
        (CGN/comp-a/nl/fn000747.sea#fn000747.94)
     b. kun je effe draaien en zoeken naar de stoel
        ‘can you turn for a second and look for the chair’
        (CGN/comp-a/nl/fn008549.sea#fn008549.132)


13 The entire CGN was searched for the lemma draaien ‘turn’ accompanied by zich (distance:
from 0 to 3 words). We found 140 relevant instances, 119 of which contain the separable prefix om
‘around’. One instance contains the prefix weg ‘away’. In the remaining 21 instances, draaien
combines with prepositional phrases introduced by naar ‘to(wards)’ (11 instances), op ‘on(to)’
(7 instances), in ‘in(to)’ (1 instance) and tot ‘towards’ (1 instance), all of them clearly indicating
the goal of the movement.

14 We conducted searches in the entire CGN, also including component o (/comp-o/), which
contains written material read aloud. In the corpus references, /nl/ refers to Netherlandic Dutch,
/vl/ to Belgian Dutch (Flemish).


In a similar vein, reflexive instances of non-translational buigen ‘bend, bow’ in the
CGN¹⁵ typically combine with prepositional phrases indicating the movement’s
endpoint (this is the case in 81 out of 82 instances in the CGN, see examples
(33a-d)), whereas intransitive, non-reflexive buigen typically evokes situations
in which the subject moves his or her body without focusing on the movement’s
endpoint or direction (see examples (34a-c)).¹⁶ This explains why reflexive zich
buigen corresponds to English bend down/over (towards), while non-reflexive buigen
must be translated by simple bow or bend (typically to someone/something).


(33) a. ik buig me iets naar voren.
        ‘I am bending forward a little’
        (CGN/comp-o/nl/fn001200.sea#fn001200.123)
     b. Rufus’ oma links van hem boog zich naar ‘m toe.
        ‘Rufus’ grandma on his left bent towards him’
        (CGN/comp-o/nl/fn001564.sea#fn001564.41)
     c. m’n moeder boog zich over me heen.
        ‘My mother bent over me’
        (CGN/comp-o/nl/fn001256.sea#fn001256.83)
     d. Daniel boog zich voorover [...]
        ‘Daniel bent forward’
        (CGN/comp-o/nl/fn001531.sea#fn001531.1)


(34) a. hij boog toen ik hem complimenten maakte. 
‘He bowed when I paid him compliments’ 
(CGN/comp-o/vl/fv801022.sea#fv801022.6)


15 The verb buigen occurs 235 times in the CGN, of which 82 instances are reflexive and 47 
intransitive and non-reflexive. 

16 This does not imply that intransitive non-reflexive buigen cannot occur with prepositional
phrases like voorover ‘forward’ or naar voren ‘forward’. Such uses do occur, but less frequently
than the reflexive use of the verb (buigt zich naar voren (7) vs. buigt naar voren (3); buigt zich
voorover (14) vs. buigt voorover (9)).
b. in de kwartfinale van de Heineken Trophy moest de Limburger met
zes drie en zes vier buigen voor de Spanjaard Tommy Robredo.
‘In the quarter finals of the Heineken Trophy the Limburger lost to
[lit. had to bend to] his Spanish opponent Tommy Robredo with 6-3
and 6-4’ (CGN/comp-k/nl/fn003726.sea#fn003726.2)

c. z’n eigen kinderen bogen voor hem en kusten z’n oude handen.
‘His own children bowed to him and kissed his aged hands’ 
(CGN/comp-o/vl/fv800926.sea#fv800926.11) 


Likewise, sentences with reflexive bukken¹⁷ ‘stoop, bend’ evoke situations in
which the subject assumes a bent-down position in order to do something in this 
position. Not surprisingly, 10 out of 22 instances of zich bukken in the CGN are 
accompanied by a final clause introduced by om ‘in order to’, see examples (35a-b). 
Another typical example of zich bukken is (35c), in which reflexive bukken is accom- 
panied by a directional prepositional phrase (te ver over de reling ‘too far over the 
guard rail’). 


(35) a. ze bukte zich om de post op te rapen 

‘she stooped to pick up the mail’ 
(CGN/comp-o/nl/fn001214.sea#fn001214.24) 

b. als we naar de wei gingen dan moesten we ons bukken hé om niet aan 
die draad te komen. 
‘when we went to the meadow, then we had to stoop in order not to 
touch the wire’ 
(CGN/comp-a/vl/fv400204.sea#fv400204.256) 

c. daar buk je je te ver over de reling 
‘there you are leaning too far over the guard rail’ 
(CGN/comp-a/nl/fn007993.sea#fn007993.40) 


Non-reflexive bukken, on the other hand, is used to describe the movement an sich, 
as in (36a). Interestingly, 5 out of 13 non-reflexive bukken-instances occur as infini- 
tival complements of negated modal verbs (as in (36b-c)) and thus imply that the 
stooping position cannot, need not or is not intended to be achieved, whereas 
none of the 22 reflexive instances of bukken contains a negator. 


17 The verb bukken is attested 48 times in the CGN. Leaving adjectival uses aside, the verb is 
used 22 times in reflexive constructions, while it occurs 13 times as an intransitive non-reflexive 
verb. 
(36) a. [...] ik was net aan m’n knie geopereerd en [...] ik mocht [...] niet bukken.
‘I had just been operated on my knee and I was not allowed to stoop’ 
(CGN/comp-a/nl/fn000616.sea#fn000616.230) 

b. uh nou uh golfen is uh knikkeren voor grote jongens die niet willen 
bukken. 
‘uh well uh playing golf is uh (like) playing marbles for big boys who 
do not want to stoop’ 
(CGN/comp-a/nl/fn000616.sea#fn000616.230)

c. nou ‘t was er allemaal wel maar niet echt goed uh nagedacht en zij zegt 
ik wil dan ook een ijskast waarbij ik niet hoef te bukken [...] 
‘well everything was there but not really well considered and she says 
then I also want a fridge for which I don’t have to stoop’ 
(CGN/comp-a/nl/fn000640.sea#fn000640.51) 


We therefore argue that the function of the Dutch weak reflexive in the middle 
domain of motion is to mark the goal-directedness of an action. Since this is the 
core meaning of the reflexive, we suggest that the notions of intentionality, resul- 
tativity, and telicity often associated with the reflexive variants of the verbs in the 
domain of autocausative motion as in e.g. (31) represent possible extensions and 
context-induced interpretations of this core meaning. 

To conclude, in Dutch the weak reflexive can be used in situations of auto- 
causative motion, in which it is clearly associated with the notions of goal-directed- 
ness and resultativity, i.e. the very notions that are also evoked by the TLC. 


3.3 Autocausative motion in English 


In English, which lacks a weak reflexive form, the strong reflexive may be used in 
resultative constructions which resemble middle situations involving autocausa- 
tive motion to some degree, see (37) (examples are taken from Oya 2002). 


(37) a. Joggers often run themselves sick. 
b. Don’t expect to swim yourself sober. 
c. She danced herself into a frenzy. 


Usage of the English strong reflexive in this domain is extremely restricted, how- 
ever. It cannot be used, for example, in contexts involving motion in which the 
result phrase expresses the final location of the subject (Oya 2002: 976). In con- 
trast to German and Dutch, the reflexive is excluded in such cases, see (38), taken 
from Oya (2002). 
(38) a. She danced/swam (*herself) free of her captors.
Ze danste/zwom *(zich) vrij van haar ontvoerders.
Sie tanzte/schwamm *(sich) frei von ihren Entführern.
b. A bantam chick kicks (*itself) free from its shell.
Een kuiken slaat *(zich) vrij uit zijn schaal.
Ein Küken kickt *(sich) frei von seiner Schale.


If a reflexive is available, as illustrated in (37a-c), some metaphorical change of 
state of the subject is expressed which is possibly achieved through motion. This 
does not seem to involve any actual change of location, however. Indeed we 
observed in section 2 above that the reflexive construction in English denotes 
resultative, internal changes of state of the grammatical subject, whereas the way- 
construction is typically associated with motion along a path. 

For English, we thus conclude that the (strong) reflexive is hardly used in the 
domain of autocausative motion at all. In those very rare cases in which the reflex- 
ive is used, the situation described does not involve any motion along a path or 
towards a goal, but rather a change in the internal, emotional or mental state of 
the subject. 


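[Figure 1: two-dimensional conceptual space. Vertical axis: GOAL-DIRECTED (− to +); horizontal axis: PATH TRAVERSAL (− to +), running from NON-TRANSLATIONAL MOTION via CHANGE IN BODY POSTURE to TRANSLATIONAL MOTION. Labelled regions: English REFL, Dutch zich, German sich.]

Fig. 1: English, Dutch and German reflexives in the autocausative domain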


Summing up, we may now locate the reflexive markers of German, Dutch and 
English on the continuum of autocausative motion represented in Table 5 above. 
Figure 1 is a modified version of the same cline, involving two basic semantic 
dimensions. We have substituted the original participant dimension (one vs. two 
participants) with the semantic component of movement or traversal of a path,
where the domain of non-translational motion does not involve the traversal of a
spatial path, whereas translational motion clearly involves the traversal of a path
towards a goal or an end point. In order to integrate the distinctive function of the 
Dutch reflexive, we also added the semantic component of goal-directedness to 
the table. 

In German, the weak reflexive is used in all sub-domains of autocausative 
motion; the whole conceptual space in Figure 1 is covered by German sich. In 
Dutch, the use of the weak reflexive is more or less restricted to contexts of 
goal-directed motion. The English strong reflexive -self, insofar as it occurs at all, 
is only used in contexts denoting resultative, internal changes of state of the 
grammatical subject without bodily motion in space towards some speaker- 
external goal. 


3.4 The way-construction and its analogues 
in the autocausative domain 


In this section, we will have a closer look at the Dutch and German analogues of 
the English way-construction. We will first relate the constructions to each other 
crosslinguistically and then compare the individual reflexive markers language- 
internally. 

As introduced in section 2, the English way-construction is basically asso- 
ciated with motion along a path, either spatially or metaphorically. This motion 
often requires some effort on the part of the subject referent, who overcomes 
obstacles by creating and traversing the path. Importantly, the English way- 
construction is not specified with respect to telicity, i.e. it is compatible with the 
goal being achieved or not. In German, there is only one analogue to the English 
way-construction, i.e. the reflexive construction (see section 2.2). Dutch (see sec- 
tion 2.1) has two different analogues to the English way-construction, and there 
is a division of labour between them. The weg-construction is not specified with 
respect to telicity and goal-directedness, i.e. it is not necessarily resultative. The 
TLC, on the other hand, is telic and resultative, i.e. it usually implies that the goal 
has been reached. 

The semantics of the English way-construction and its analogues in Dutch
and German may be described in the same conceptual terms as the autocausative
domain of middle situations. There is usually a human subject who intentionally
creates a path along which he or she moves towards a goal. Figure 2 below
therefore uses the same conceptual space as Table 5 and Figure 1 to compare the
constructions.
The English way-construction and the German reflexive construction [V sich
OBL] cover the whole conceptual space of autocausative motion, as they convey
meanings compatible with both [+/- goal-directedness] and [+/- path traversal]
situations. The English reflexive resultative construction is shown in the top left 
corner of Figure 2 (see section 2 and Figure 1). It usually refers to subject-internal 
change and has gradually been replaced by the way-construction, which now 
occupies the whole conceptual space of autocausative motion. The only remaining 
area in which the resultative reflexive construction survives is the region charac- 
terized by the features [- path traversal] and [+ goal-directed]. 

The German reflexive construction is similar to the English way-construction 
with respect to semantics. Its productivity is more restricted, however (see sec- 
tion 2.2), mainly due to the fact that many lexical verbs which otherwise could 
have been perfect candidates for the verbal slot of this construction are in fact 
lexicalized reflexive verbs. Hence, the highly productive lexical reflexive pattern 
of German blocks the schematic construction [V sich OBL]. Both constructions are 
therefore in competition for the same conceptual space of autocausative motion 
(compare Figures 1 and 2). 


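[Figure 2: two-dimensional conceptual space. Vertical axis: GOAL-DIRECTED (− to +); horizontal axis: PATH TRAVERSAL (− to +), running from NON-TRANSLATIONAL MOTION via CHANGE IN BODY POSTURE to TRANSLATIONAL MOTION. Labelled regions: English REFL, English way, German REFL, Dutch TLC, Dutch weg-cxn.]

Fig. 2: Analogues of the way-construction in the autocausative domain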


As shown in section 2.1, the Dutch TLC and the weg-construction differ with 
respect to telicity and resultativity. The weg-construction describes the incremental 
traversal of a path by means of, or during, the action described in the verb, i.e. 
it is not specified for telicity. The TLC, on the other hand, denotes the transition 
to a location and implies that the endpoint is reached. These two constructions 
share the conceptual space of autocausative motion; they show a basic division of 
labour, but also an area of overlap. The TLC is preferred in situations that are more
goal-directed and do not necessarily involve path traversal; hence it is located in the
top left area of the continuum in Figure 2. The weg-construction, on the other hand,
is located towards the less goal-directed, more path-oriented end of the continuum.

A corpus study by Smirnova/Mortelmans (subm.), however, detects some 
recent diachronic changes in the German reflexive construction which may be 
interpreted as early evidence of a (sort of) division of labour between the older 
lexical reflexive pattern and the more recent syntactic reflexive construction. 
During the 20th century, the reflexive construction seems to have developed a
dominant sense of [+ path traversal] without an obligatory reference to a reached
endpoint. This development manifests itself for example in the increased relative 
frequency of atelic verbs filling the verbal slot of the construction, and also in the 
increased relative frequency of the preposition durch, which is becoming the domi- 
nant preposition in the construction, see (39). 


(39) a. [...] soll man sich jetzt wirklich durch 70.000 Seiten Material arbeiten 

(DECOW14) 
‘should one really work one’s way through 70,000 pages of material’ 

b. Autofahrer [...] quälen sich bislang durch den Ort. (DECOW14) 
‘drivers torture their way through the city’ 

c. Rüttgers nuschelte sich mal wieder durch hohle Phrasen (DECOW14) 
‘and again Rüttgers mumbled his way through empty phrases’ 

d. Ich habe mich natürlich durch alle Sorten durchprobiert (DECOW14) 
‘Needless to say I tested my way through all kinds’ 


This diachronic tendency is consonant with the directionality of change in English, 
where the older reflexive pattern was first replaced gradually by the new way-con- 
struction in the domain of [+ path traversal]. This tendency clearly holds across 
all three languages under consideration, as suggested by the continuum in Fig- 
ures 1 and 2. Whereas the original (more lexical) reflexive patterns are associated 
with the goal-directedness of a situation (i.e. with the top left part of the con- 
tinuum), the more recent constructions favor the interpretation of path traversal 
(i.e. the bottom right part of the continuum). This is most in evidence in English 
and Dutch, where the noun way/weg in the more recent constructions explicitly 
refers to a path, thus explaining the conceptual association of these constructions 
with the semantic notion of path traversal. It is less obviously, but still arguably, 
the case in German, where the newly developed reflexive construction is formally 
identical to the lexical reflexive patterns. Nevertheless, we observe the same ten- 
dency towards a division of labour, as the more recent construction tends to move 
towards the [+ path traversal] end of the continuum. 
4 Conclusions and outlook


In this paper, we have shown that the English way-construction and its various 
reflexive analogues in German and Dutch cover different portions of the domain 
of autocausative motion, and that this variation can be described by means of 
two semantic dimensions, i.e. goal-directedness (telicity) on the one hand, and 
path-traversal, on the other. 

From a more theoretical perspective, two conclusions can be drawn. First, the 
existence of close formal analogues in two languages (e.g. the weg-construction in 
Dutch and the way-construction in English) should not be taken to imply that these 
analogues are similar in functional terms as well. The English way-construction is 
functionally much more versatile than its Dutch counterpart, not only because of the
greater variation in the verb slot, but also because the English way-construction has 
outcompeted the resultative reflexive construction and now occupies territory that in 
Dutch is taken up by the TLC. Whereas English speakers prefer the way-construction 
to describe the metaphorical motion event of reaching the finals, as in example (40), 


(40) He swam his way into the finals. 


Dutch speakers are more inclined to use the ‘resultative’ TLC in this context, in 
part because both subevents are consecutive (the swimming occurs first, reach- 
ing the finals is a later event). 


(41) Hij zwom zich naar de finale. 
‘He swam his way into the finals’ 


Second, we are proposing a distinction between schematic constructions (e.g. 
[V sich OBL] in German) on the one hand, and more substantial lexical constructions
(e.g. inherently reflexive verbs) on the other. Although the two may look
superficially alike, instances of the latter do not necessarily instantiate the former 
(e.g. Der Zug bewegte sich zum Brunnen ‘The procession moved REFL towards 
the fountain’, where sich bewegen is inherently reflexive). Future research should 
focus on the interaction of the schematic reflexive construction with existing lexi- 
cally reflexive verbs in German (i.e. the more substantial constructions). The latter 
may well have facilitated the creation of the former, as the more abstract construc- 
tion schematizes over a relatively frequent formal pattern. On the other hand, it 
may just as well be hindering its full development, as the schematic reflexive con- 
struction constantly interacts with formally similar instances that do not actually 
instantiate it, but compete with it in the same conceptual domain (autocausative 
motion). Diachronic data in particular could shed more light on this issue. 
Tom Bossuyt (Ghent)

Lice in the fur of our language? German 
irrelevance particles between Dutch and 
English 


Abstract: The present paper compares the distribution of English -ever, German 
immer and/or auch, and Dutch (dan) ook in universal concessive-conditional and 
nonspecific free relative subordinate clauses (e.g. G. Was auch immer du willst 
‘Whatever you want’) and in their elliptically reduced versions (e.g. D. ... of wat 
dan ook ‘... or whatever’). By combining large language-specific corpora such 
as the DEREKo, SoNaR, and BYU corpora with the smaller multilingual Conver- 
GENTiecorpus, 38,748 instances were obtained while maintaining comparability. 
Whereas present-day English has only one option in both clausal and elliptical 
constructions, viz. WH-ever, Dutch and German show more variation: in Dutch, 
discontinuous W... ook is by far the most frequent option in subordinate clauses, 
while the complex particle dan ook is largely confined to elliptical constructions. 
In German subordinate clauses, immer in adjacency to the W-word is the most 
frequent option, thus corresponding to English WH-ever, but in elliptical constructions
auch immer predominates, thus corresponding to Dutch dan ook.


Zusammenfassung: Der vorliegende Beitrag vergleicht die Distribution von engl. 
-ever, dt. immer und/oder auch und ndl. (dan) ook in universalen Irrelevanzkon- 
ditionalen und verallgemeinernden Relativsätzen (z.B. engl. Whatever you want 
‘Was auch immer du willst’) sowie in ihren elliptisch reduzierten Varianten (z.B. 
ndl. ... of wat dan ook ‘... oder was auch immer’). Dank der Kombination großer 
sprachspezifischer Korpora wie des DEREKO, des SoNaR-Corpus und der BYU- 
Corpora mit dem kleineren mehrsprachigen ConverGENTiecorpus konnten 38.748 
Belege erhoben werden, wobei Vergleichbarkeit gewahrt blieb. Während im heu- 
tigen Englisch WH-ever sowohl in Nebensätzen als auch in elliptischen Konstruk- 
tionen die einzige Möglichkeit ist, zeigen das Niederländische und Deutsche 
mehr Variation: In ndl. Nebensätzen kommt das diskontinuierliche W ... ook am 
häufigsten vor, während sich die komplexe Partikel dan ook größtenteils auf ellipti- 
sche Konstruktionen beschränkt. In dt. Nebensätzen ist W-adjazentes, dem engl. 
WH-ever entsprechendes immer die häufigste Möglichkeit, in elliptischen Kon- 
struktionen dominiert aber das dem ndl. dan ook entsprechende auch immer. 


Open Access. © 2020 Bossuyt, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110668476-004
1 Introduction


Reiners (1949: 283) referred to German modal particles as “die Läuse im Pelz unserer
Sprache” (lit. ‘the lice in the fur of our language’), dismissing these small
words, such as mal and doch, as superfluous and not worthy of the attention
of linguists (cf. Hentschel 2012: 124f.). In the meantime, the tide has turned and
the amount of work published on modal particles has been overwhelming ever 
since (König 2010: 79; Müller 2017: 384). Other kinds of apparent “lice in the fur”, 
on the other hand, seem to have been mostly ignored. Among them are so-called 
“irrelevance particles”, i.e. quantificational particles which occur in universal 
concessive conditionals (henceforth: UCCs) in certain languages (Haspelmath/ 
König 1998): 


(1) a. English: Whatever he says, nobody listens to him.
b. German: Was immer er auch sagt, jeder hört ihm zu.
‘Whatever he says, everybody listens to him.’
c. Dutch: Wat Jan ook zegt, Marie luistert naar hem.
‘Whatever John says, Mary listens to him.’


Like all concessive conditionals, UCCs express a basic conditional meaning (König 
1986; Leuschner 2006; Breindl 2014). While prototypical conditionals express one 
antecedent value p in their protasis, which is followed by a consequent q in the 
apodosis ... 


(2) If the weather is nice today (= p), we’ll go hiking (= q). 


... concessive conditionals express a multiplicity of antecedent values (if p then q), 
whose individual truth values are irrelevant to the truth value of the consequent: 


(3) Whatever tomorrow’s weather is like (= pₓ), we’ll go hiking (= q).
a. If tomorrow’s weather is A (= p₁), we’ll go hiking (= q).
b. If tomorrow’s weather is B (= p₂), we’ll go hiking (= q).
c. [...]
d. If tomorrow’s weather is X (= pₙ), we’ll go hiking (= q).

Since these values are ordered along a given parameter (e.g. the characteristics
of tomorrow’s weather in (3)), the protasis typically contains at least one contextually
extreme condition pₙ which carries a presupposition to the effect that ~q
rather than q would normally be expected to be true (König 1986: 234). For example, under
the condition If there’s a blizzard tomorrow (= pₙ), one would normally expect
we won’t go hiking (= ~q) to be true.¹ This is why UCCs such as (3) often evoke a
concessive interpretation, whence the epithet “concessive”.

Despite the label “universal concessive conditionals”, the type of quantifier 
used in UCCs is different from standard universal quantification. Instead, it is more 
reminiscent of a “free-choice quantifier” (König/Eisenberg 1984: 315), whose effect 
is to allow the recipient to select a random value for the variable expressed by the 
WH-word in the protasis (König 1986: 231). In English at least, whatever and the 
other -ever-compounds are thus quantificationally more similar to free-choice any 
than to standard universal every or all, and it is precisely the “domain-widening” 
effect of any (Kadmon/Landman 1993) that -ever contributes to the meaning of
UCCs. With regard to German, the free-choice analysis seems to be contradicted by 
the fact that immer, usually seen as the equivalent of always rather than ever (e.g. 
Er kommt immer zu spät ‘He is always late’), is used as the counterpart of -ever in 
German UCCs (cf. 1a and 1b above). There is an elegant diachronic solution to this 
apparent riddle, however: immer, which is a partial cognate of ever, used to have 
both universal and free-choice temporal readings in earlier stages of German 
(i.e. both ‘at all times’ and ‘at any time’; Leuschner 1996). Its present-day use in 
UCCs is a residue of this earlier free-choice reading, but immer lost its temporal 
force when it was recruited as a quantificational particle in UCCs, retaining only 
the free-choice part of its semantics in this function (cf. ibid.: 481). 

Despite the etymological link between -ever and immer, the surface realization 
of UCC quantification in both languages is quite different overall. As can be seen 
in (1b) above, immer is not the only irrelevance particle in German. The other 
option, auch (lit. ‘also’), is etymologically identical to the Dutch irrelevance particle
ook. Moreover, -ever and immer seem to share a preference for the
position immediately adjacent to the WH-word or W-word respectively, whereas auch and ook
seem to prefer positions further to the right in the subordinate clause. We are
thus faced with a rather atypical “Germanic Sandwich”-pattern (cf. van Haeringen 
1956) in which German is situated between English and Dutch rather than Dutch 
between English and German. 

Since these differences and similarities have so far mostly gone unnoticed in 
the literature, it is the goal of the present paper to present the first contrastive 
corpus study on the distribution of the irrelevance particles -ever, immer and/or 


1 This extreme condition pₙ is the end point of a contextually salient scale which may be either
canonical or inverted. A canonical scale is invoked by (1a) above, which can be read as ‘no matter 
how high the quality/amount of what he says, nobody listens to him’. By contrast, (1b) invokes 
an inverted scale: ‘no matter how low the quality/amount of what he says, everybody (still) 
listens to him’. I am grateful to two anonymous reviewers for pointing out this and other issues. 
auch, and (dan) ook in UCCs and related constructions (cf. below). This study is
thus a trilingual extension of a previous study by Bossuyt/De Cuypere/Leuschner 
(2018) on the patterns and frequencies of German immer/auch, which is based on 
a sample of 23,299 instances with was ‘what’ and wer ‘who’ (incl. inflectional 
forms), gleaned from the Deutsches Referenzkorpus (henceforth: DEREKo). The
latter study is in itself a semi-replication of Leuschner’s study (2000) on immer/ 
auch, based on 104 instances from the Mannheimer Korpus, which contained 
ca. 2.2 million tokens. 

In order to obtain a large amount of sufficiently comparable data, large lan- 
guage-specific corpora were triangulated for the present study with the respective 
components of a small but comparable multilingual corpus (cf. section 2). All 
occurrences were analyzed for combinatorial variation on the one hand (answering 
the question which particle(s) is/are used) and positional variation on the other 
hand (which position(s) the particle(s) tend to occupy). Focusing on German and 
Dutch, section 3 presents the distributional patterns of irrelevance particles in 
subordinate clauses (3.1) and elliptically reduced constructions (3.2). A discussion 
on the similarities and differences between, first, Dutch (dan) ook and German 
auch (immer), and then between English -ever and German immer follows in 
section 4. 

This paper will argue that the synchronic distributions of the particles repre- 
sent a snapshot of the long-term emergence of irrelevance marking as a subsys- 
tem in each of the three languages, with varying degrees of grammaticalization. 
Whereas the grammaticalization of English WH-ever is more or less complete, the 
grammaticalization process of German W immer/auch subordinators seems to 
have lost its former directionality, resulting in a situation resembling a long-term 
“grammaticalization building-site” (Grammatikalisierungsbaustelle, Leuschner
2006; cf. Nübling 2005). Finally, discontinuous Dutch W... ook shows only weak 
signs of grammaticalization. 


2 Methodology: corpus triangulation 


2.1 Corpora and search queries 


As mentioned above, this study combines data from very large, language-specific 
corpora (which are, however, barely comparable) with data from a small, but very 
comparable multilingual corpus. The goal of this methodology is to obtain a large 
amount of data while maintaining comparability. The following language-specific 
corpora were used: 
— The “Archiv W” of the DEREKo is the main reference corpus of contemporary
written German, containing approximately 9.2 billion tokens in total as of
August 2018.² The corpus consists of a variety of text types, often from printed
news media in Germany, Austria, and the German-speaking part of Switzerland,
recently supplemented with a considerable number of Wikipedia articles
and discussions as well as parliamentary minutes (Kupietz/Lüngen 2014).

— The SoNaR corpus is a 500-million-word reference corpus of contemporary 
written Dutch. It consists of both conventional media (e.g. newspapers) and 
new media (e.g. tweets, blogs, or chat conversations), and is fairly well-bal- 
anced between Dutch and Flemish texts (Oostdijk et al. 2013). 

— The BYU corpora are probably the most widely used online corpora for English.⁴
This study combines data from the BYU-BNC (100 million tokens of
British English, 1980s-1993), COCA (560 million tokens of American English, 
1990-2017), Strathy Corpus (50 million tokens of Canadian English, 1970s- 
2000s), Wikipedia Corpus (1.9 billion tokens, 2012-13), and Hansard Corpus 
(1.6 billion tokens of British parliamentary minutes, 1803-2005), containing 
over 4.2 billion tokens in total. Combining these corpora somewhat mimics 
the composition of the DEREKo. 
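
As a quick arithmetic check on the combined size of the English corpora, the component sizes given in the list can simply be summed. The sketch below only restates those figures; the dictionary and variable names are illustrative assumptions and not part of any corpus interface.

```python
# Token counts in millions, as stated in the corpus list above.
byu_components = {
    "BYU-BNC": 100,
    "COCA": 560,
    "Strathy Corpus": 50,
    "Wikipedia Corpus": 1_900,
    "Hansard Corpus": 1_600,
}

# Sum the components and convert millions to billions.
total_millions = sum(byu_components.values())
total_billions = total_millions / 1_000

print(f"{total_millions} million tokens = {total_billions:.2f} billion")
```

The sum comes to 4,210 million tokens, which agrees with the stated figure of "over 4.2 billion tokens in total".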


The small but comparable multilingual corpus used for the present study is the 
ConverGENTiecorpus, which is institutionally available at Ghent University.⁵ It
consists of seven subcorpora in English, Dutch, German, French, Spanish, Italian, 
and Portuguese, containing about 1.5 million tokens each. Comparability is guaranteed,
as all subcomponents contain approximately the same number of tokens
distributed over a wide variety of corresponding text genres. 

Search queries for the present study in the ConverGENTiecorpus included 
virtually all WH-words, as was the case in Leuschner’s original study, which, how- 
ever, referred exclusively to the German Mannheimer Korpus (Leuschner 2000). 
By contrast, the search queries in the large corpora were limited to WH-words for 
‘what’ and ‘who’ (incl. inflectional forms, if applicable, e.g. whom) for practical 
reasons, as was also the case in Bossuyt/De Cuypere/Leuschner (2018). 


2 https://www.ids-mannheim.de/cosmas2/projekt/referenz/archive.html (last accessed: 19-3-2019). 
3 https://portal.clarin.nl/node/4195 (last accessed: 19-3-2019). 

4 https://corpus.byu.edu/ (last accessed: 19-3-2019). 

5 http://research.flw.ugent.be/en/projects/convergentiecorpus (last accessed: 19-3-2019). 
For the English data, search queries for whatever, whoever, whomever, and
whosever⁶ were conducted separately in each of the abovementioned corpora. A
total of 4,642 instances were found. Search queries for these and other -ever-com- 
pounds conducted in the ConverGENTiecorpus resulted in a total of 1,240 exported 
instances. 

For the German data, Leuschner’s (2000) conclusions on the positional ten- 
dencies of immer and auch were taken into account when designing the corpus 
search queries, in order to maximize recall ratios. For immer, only instances 
where the W-word (i.e. was, wer, wem, or wen) immediately precedes immer were 
initially included. At a later stage, search strings with immer immediately preceded
by a 3rd person singular pronoun which was in turn immediately preceded by the
W-word (e.g. wer es immer) were included to find rare occurrences in which immer 
follows the subject rather than preceding it. For wessen, which can modify nouns 
(e.g. wessen Haus ‘whose house’), the distance operator was set to 3. For auch, 
a distance operator of 4 was found to be the best balance between precision and 
recall (cf. Bossuyt/De Cuypere/Leuschner 2018: 101 fn. 5). A total of 53,732 instances 
were exported and analyzed manually (cf. 2.2 below). In the ConverGENTiecorpus, 
distance operators allowing up to 5 words between the W-word and the irrele- 
vance particle were included, guaranteeing that virtually every instance in the 
corpus was included in the sample. 

For the Dutch data, the queries were designed to resemble those used to 
search instances of auch in the DEREKo. This means that search queries allowed 
up to three words between the W-word (i.e. wat, wie or wiens) and the irrelevance 
particle ook. A total of 30,895 instances were exported and analyzed manually (cf. 
2.2 below). As with the German data, distance operators allowing up to 5 words 
were included in the search queries in the ConverGENTiecorpus, ensuring that 
nearly every instance was included in the sample. 
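
The effect of a distance operator can be illustrated with a minimal sketch. It is not the query syntax of COSMAS II or of any other corpus tool; the function name w_particle_pattern and the parameter max_gap (the number of words allowed between the W-word and the particle) are hypothetical and chosen for illustration only.

```python
import re

def w_particle_pattern(w_words, particle, max_gap):
    """Regex for a W-word followed by `particle` with at most `max_gap`
    intervening words (a rough stand-in for a corpus distance operator)."""
    w_alternatives = "|".join(re.escape(w) for w in w_words)
    return re.compile(
        rf"\b(?:{w_alternatives})\b"      # the W-word
        rf"(?:\s+\S+){{0,{max_gap}}}?"    # up to max_gap intervening words
        rf"\s+{re.escape(particle)}\b",   # the irrelevance particle
        re.IGNORECASE,
    )

GERMAN_W_WORDS = ["was", "wer", "wem", "wen"]

# Adjacent W immer (no intervening word) vs. W ... auch with up to three words.
adjacent_immer = w_particle_pattern(GERMAN_W_WORDS, "immer", 0)
distant_auch = w_particle_pattern(GERMAN_W_WORDS, "auch", 3)

for sentence in ["Was immer sie tun", "wer es immer ist", "Mit wem ich auch rede"]:
    print(sentence, bool(adjacent_immer.search(sentence)),
          bool(distant_auch.search(sentence)))
```

A pattern of this kind matches on surface order only, so it also retrieves instances in which immer is a temporal adverb or auch/ook a focus particle. Widening the gap raises recall at the cost of precision, which is why hits retrieved in this way still have to be checked by hand.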


2.2 Manual analysis of the German and Dutch data 


Whereas the results for English WH-ever are mostly unambiguous, the German 
and Dutch data needed manual analysis to check whether immer, auch resp. ook 
did indeed function as irrelevance particles. This is because immer can also be 


6 Unfortunately, the possessive form whoever’s could not be included in this study, since this 
search query resulted in too many invalid instances consisting of whoever followed by the con- 
tracted form of is. Adding the noun tag did not solve this problem, nor did tagging whoever’s as 
a possessive determiner. 
a temporal adverb (cf. above), as shown in (4), and auch/ook can also be focus 
particles, as shown in (5) and (6): 


(4) #Was immer bleiben wird, ist mein Code civil. (Die Zeit (Online-Ausgabe), 
25-2-2010) 
‘What will always remain, is my Code civil.’ 


(5) #Was es heute jedoch auch häufiger gibt, sind Mütter, die arbeiten. 
(Braunschweiger Zeitung, 12-9-2008) 
‘However, what is nowadays more common as well are working mothers.’ 


(6) #Wat ook speciaal zal zijn, is het Japanse theehuisje van S. D. 
(WR-P-P-G-0000666221) 
‘What will be special, too, is S. D.’s Japanese tea cottage.’ 


Moreover, numerous duplicates had to be removed from the DEREKo and SoNaR data. 
This brought the final DEREKo sample to 23,299 instances (also used in Bossuyt/ 
De Cuypere/Leuschner 2018) and the final SoNaR sample to 9,305 instances. The 
ConverGENTiecorpus contains 91 instances for German and 171 for Dutch. 

Not all of these instances represent prototypical UCCs as mentioned in (1) and 
(3). The German sample in particular contains a considerable number of 
non-specific free relatives (henceforth: NFRs), as in (7): 


(7) Wer immer bisher als “künftiger Papst” ins Konklave ging, kam als Kardinal 
wieder heraus. (Nürnberger Nachrichten, 14-10-2003) 
‘Whoever entered the conclave as a “future pope” so far, came out again as 
a (mere) cardinal.’ 


The free-choice semantics and quantificational strategies in these subordinate 
clauses are roughly the same as in UCCs, but the syntactic function of the 
subordinate clause in the complex sentence is different: whereas UCCs typically function 
as a loose adjunct to their apodosis, an NFR typically functions as an embedded 
argument in its respective main clause (Leuschner 2005), e.g. as its subject in (7), 
with a broad transitional zone of surface variation linking the two sentence types 
(Leuschner 2005: 59-62; Breindl 2014: 981f.). For the present study, however, the 
relevant syntactic distinctions are less important than the semantic-functional 
overlap between UCCs and NFRs, as shown by the fact that both clause types can 
be paraphrased by an open conditional (cf. Lehmann 1984: 339): 


(7') If x went into the conclave as a “future pope”, x came out again as a 
cardinal. 
It is the presence of a variable in the underlying conditional relationship that 
motivates the quantificational strategies that are shared by UCCs and NFRs. 
Details in the surface realization of irrelevance marking may well vary with the 
syntactic status of the subordinate clause and such potential patterns should be 
addressed in future research into irrelevance marking. Only the overall patterns 
of irrelevance marking are in the focus of the present study, however, and hence 
no systematic distinction will henceforth be drawn between UCCs and NFRs. This 
decision is reflected in the label primary irrelevance constructions for both clause 
types together (as opposed to secondary irrelevance constructions, which occur 
at the sub-clausal level, cf. below). 

Whereas there is only one strategy to mark free-choice quantification with an 
irrelevance particle in English primary constructions, namely by attaching -ever 
to the WH-word,⁷ the same quantificational effect is conveyed by different particles 
resp. particle combinations in different positions in German and Dutch. In order 
to account for this variation, Dutch and German primary constructions are ana- 
lyzed using Leuschner’s (2000) adaptation of the Topological Field Model (cf. 
Wöllstein 2014) as demonstrated in Table 1. 


Table 1: Leuschner’s (2000: 345) adaptation of the Topological Field Model for primary 
irrelevance constructions in which the W-word is not the subject of the subordinate clause, 
exemplified by (1b) 


        pre-     left       middle-field          right      post-
        field    bracket                          bracket    field

        W        –          II      S     IV      V          –

(1b)    was      –          immer   er    auch    sagt       –


While the W-word occupies the pre-field, leaving the left bracket unoccupied in 
Standard German (Wöllstein 2014: 32-37), the middle-field is divided into a field S 
for the subject of the subordinate clause and two fields which may be occupied by 
irrelevance particles: field II to the left of S and field IV to the right of S (Leuschner 


7 The WH-so-ever-pattern (e.g. whosoever, whatsoever) is unproductive and archaic in present- 
day English. The only exception is whatsoever as a post-nominal NPI, e.g. no idea whatsoever 
‘no idea at all’. Its intensifying meaning is, however, considerably different from the free- 
choice quantificational readings the present study is concerned with, and will therefore not 
be considered any further. 
2000: 345). As usual in German subordinate clauses, the verb occupies the right 
bracket (V) and the post-field is standardly left unoccupied. 

The topological model of Table 1 only makes sense if the W-word is not the 
subject of the subordinate clause. If the W-word is the subject, on the other hand, 
there is no need to split up the middle-field, which is then simply called II/IV 
(Leuschner 2000: 345f.). 


Table 2: Leuschner’s (2000: 346) adaptation of the Topological Field Model for primary 
constructions in which the W-word is also the subject of the subordinate clause, exemplified by 
(8), taken from the SoNaR corpus 


        pre-     left       middle-field    right      post-
        field    bracket                    bracket    field

        W        –          II/IV           V          –

(8)     wie      –          morgen ook      wint       –
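
The field assignments of the two topological models can be made concrete with a small sketch. It assumes a heavily simplified input: the subordinate clause is given as a list of words with the W-word first and the finite verb last, the subject is a single word whose position has been annotated by hand, and the particle inventory is hard-coded. The names PARTICLES and classify_fields are hypothetical.

```python
# Simplified particle inventory; real data would need disambiguation
# (e.g. immer as a temporal adverb, auch/ook as focus particles).
PARTICLES = {"immer", "auch", "ook", "dan"}

def classify_fields(tokens, subject_index):
    """Return (field II particles, field IV particles) for a W != S clause.

    tokens:        clause as a word list, W-word first, finite verb last
    subject_index: position of the one-word subject (assumed manual annotation)
    """
    # Field II: everything between the W-word and the subject.
    field_ii = [t for t in tokens[1:subject_index] if t.lower() in PARTICLES]
    # Field IV: everything between the subject and the clause-final verb.
    field_iv = [t for t in tokens[subject_index + 1:-1] if t.lower() in PARTICLES]
    return field_ii, field_iv

# Example (1b) from Table 1: was immer er auch sagt
print(classify_fields(["was", "immer", "er", "auch", "sagt"], subject_index=2))
```

When the W-word is itself the subject, there is no subject position to split the middle-field, and all particles would simply be assigned to the undivided field II/IV.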


While these two models fit the majority of the data, a considerable number of 
instances containing irrelevance marking do not fit either model (93/171 = 54.39% 
in the Dutch component of the ConverGENTiecorpus, 3,921/9,305 = 42.14% of 
SoNaR data; 35/91 = 38.46% in the German component of the ConverGENTiecorpus, 
4,926/23,299 = 21.14% of DEREKo data). These instances are derived historically 
from primary constructions, but have been reduced by ellipsis (Breindl 2014: 980f.; 
Leuschner 2013: 57; Waßner 2006: 386f.). They are labeled secondary irrelevance 
constructions in the present study and may function as: 


(9) general extenders (Overstreet 1999: 122-124, 147; Brinton 2017: 273-278) 

a. Zij worden nooit voor dief of wat dan ook uitgescholden. 
(WR-P-P-G-0000427484) 
‘They are never called thieves or whatever.’ 

b. Ich war immer betrunken, stoned oder was auch immer. 
(Braunschweiger Zeitung, 1-7-2010) 
‘I was always drunk, stoned, or whatever’. 

c. [...] there may be a gunboat, or whatever - I do not know. (Hansard90) 


(10) discourse markers (Brinton 2017: 268-282 on English whatever) 
a. [...] maar wat dan ook, jij bent de mooiste. (WR-U-E-A-0000104003) 
‘but whatever, you are the most beautiful.’ 
b. Doch was auch immer: Ein Crash ist trotzdem jederzeit möglich. 
(Die Südostschweiz, 22-10-2006) 
‘But whatever: a crash is nevertheless a possibility at all times.’ 
c. [...] we’d just talk about, I don’t know [pause] whatever, she’d probably 
agree with everything I said as well because that’s what Catherine’s like 
(BNC KP4 S_conv) 


(11) indefinite pronouns (cf. Haspelmath 1997: 139, 160f.) 

a. De beklimming van de Everest is voor wie dan ook superzwaar. 
(WS-U-E-A-0000000442) 
‘Climbing the Everest is super tough for anyone (lit. whoever).’ 

b. Ein Appell an wen auch immer, der sich verantwortlich fühlt. 
(Süddeutsche Zeitung, 17-7-2008) 
‘A call to anyone (lit. whoever) who feels responsible.’ 

c. Romney can run a great campaign, spend untold millions in the final 
days, do whatever, but it’s still the president who has more agency here. 
(EN_Jou_Com_0077) 


Indefinite pronouns of the type WH + particle(s) in (11) are more common in Dutch (cf. 
Hoeksema 2012 on W dan ook-pronouns); some native speakers of German and English 
might not even accept (11b) resp. (11c) as grammatical. In English, indefinite pronouns 
from the any-series are usually used in these contexts (cf. Haspelmath 1997). 

Since irrelevance particles show strikingly different distributional patterns in pri- 
mary and secondary irrelevance constructions in German and Dutch, a clear distinc- 
tion between primary and secondary irrelevance constructions will be made in the 
following sections. 


3 Distributional patterns 


3.1 Primary irrelevance constructions 


3.1.1 Dutch 


Table 3 represents the distribution of the Dutch irrelevance particle ook in primary 
irrelevance constructions in the ConverGENTiecorpus.⁸ An example of each type 
from the corpus is given in (12).⁹ 


8 Note that the left bracket and the post-field are left out of this and subsequent tables, as they 
are irrelevant to the particles’ distribution. 

9 Cases with a copula as in (12a) and (12b) were analyzed as W II S IV V-patterns, since the finite 
verb agrees in number with the NP, not the W-word: Wie hij.SG ook mocht.SG zijn ‘Whoever he 
might have been’, but Wie zij.PL ook mochten.PL zijn ‘Whoever they might have been’. 
Table 3: Distribution of irrelevance particles in Dutch primary constructions with W ≠ S in the 
ConverGENTiecorpus. # stands for raw frequencies, % for relative frequencies 


      W     II     S    IV     V      #        %
a.    W     ook    S    –      V      3     3.90%
b.    W     –      S    ook    V     74    96.10%

                                     77   100.00%


(12) a. Wat ook het statuut van het kind in kwestie is, [...]: elk kind heeft recht op 
huisvesting, onderwijs, gezondheidszorg, ... (NE_Jou_Com_1047) 
‘Whatever the status of the child in question is: every child has a right 
to housing, education and health care.’ 
b. Maar wie hij ook mocht zijn of geweest was, hij was dood. 
(NE_Jou_Com_1137) 
‘But whoever he may have been or had been, he was dead.’ 


Ook clearly occurs much more often in field IV (96.10%) than in field II (3.90%). 
This rightward tendency is confirmed by the data from the much larger SoNaR 
corpus (4,808/4,977 = 96.60% in field IV vs. 169/4,977 = 3.40% in field II), as 
shown in Table 4. Apart from ook, the much rarer particle combination dan ook 
occurs in primary irrelevance constructions and shares ook’s preference for field 
IV (132/136 = 97.06% in field IV vs. 4/136 = 2.94% in field II). (13) provides an 
example of each type from the corpus. 


Table 4: Distribution of irrelevance particles in Dutch primary constructions with W ≠ S in the 
SoNaR corpus 


      W     II         S    IV         V        #        %
a.    W     ook        S    –          V      169    3.31%
b.    W     –          S    ook        V    4,808   94.03%
c.    W     dan ook    S    –          V        4    0.08%
d.    W     –          S    dan ook    V      132    2.58%

                                            5,113  100.00%


(13) a. Wat ook de directe oorzaak mag zijn waardoor het vredesproces is 
vastgelopen, het lijdt geen twijfel dat hervatting van een dialoog de 
spanning tot normale proporties kan terugbrengen. 
(WR-P-P-I-0000000313) 
‘Whatever may be the direct cause which got the peace negotiations 
bogged down, there is no doubt that resuming the dialogue will bring 
the tensions back to normal proportions.’ 

b. De beste ploeg zal winnen. Wie dat ook is, ik zal altijd een fles cham- 
pagne opentrekken. (WR-P-P-G-0000642881) 
‘The best team will win. Whoever that is, I will pop a bottle of cham- 
pagne in any case.’ 

c. Wat dan ook de oorzaak is, leg de zieke met de voeten omhoog en zorg 
dat hij voldoende lucht krijgt. (WR-P-P-H-0000061428) 
‘Whatever the cause is, lay down the sick person with their feet up and 
make sure they get enough air.’ 

d. [...]om de aandacht te trekken van de geïnteresseerden, wie dat dan ook 
mogen zijn. (WR-P-P-G-0000265835) 
‘to draw the attention of those who are interested, whoever that may 
be.’ 


Table 5 represents the distribution of Dutch irrelevance particle(s) (dan) ook, based 
on the SoNaR corpus, in primary constructions in which the W-word is also the 
subject of the subordinate clause. (14) gives an example of each type from the 
corpus.¹⁰ 


Table 5: Distribution of irrelevance particles in Dutch primary constructions with W = S in the 
SoNaR corpus 


      W     II/IV      V      #        %
a.    W     ook        V    257   94.83%
b.    W     dan ook    V     14    5.17%

                            271  100.00%


(14) a. Wat hier ook wordt besloten, ik ben ervan overtuigd dat we een 
onomkeerbaar proces in gang zetten waardoor heel Europa een geheel 
ander aanzien zal krijgen. (WR-P-P-I-0000000272) 


10 There are no instances of primary constructions in which the W-word is the subject of the 
subordinate clause in the ConverGENTiecorpus sample, although this is partially due to the fact 
that instances with Dutch er, e.g. wat er ook gebeurt ‘whatever happens’, were classified as 
instances of the W II S IV V-pattern (cf. Table 3). 
‘Whatever is decided here, I am convinced that we set in motion an 
irreversible process which will totally alter the face of Europe.’ 

b. Wie dan ook mij deze ketting gaf, moet van mij gehouden hebben. 
(WR-P-E-G-0000010823) 
‘Whoever gave me this necklace, must have loved me.’ 


Despite ook clearly being more frequent in both types of primary irrelevance con- 
structions, a chi-square test suggests that dan ook is significantly overrepresented 
in W II/IV V-constructions as shown in Table 6 (Yates χ² = 5.08; df = 1; p = 0.02; 
Cramér’s V = 0.33): 


Table 6: Chi-square test comparing occurrences of Dutch irrelevance particles ook and dan ook 
in both types of primary irrelevance constructions in the SoNaR corpus. Standardized residuals 
are given in brackets, values higher than |2| indicate a significant deviation from the expected 
cell value and are in bold. No cells have an expected value below 5 


                 ook              dan ook         total

W II S IV V      4,977 (+0.08)    136 (-0.5)      5,113
W II/IV V          257 (-0.37)     14 (+2.17)       271
total            5,234            150             5,384
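
The Yates-corrected test statistic can be recomputed from the four cell counts in Table 6. The following is a minimal sketch in plain Python, not the author's actual workflow (the statistical software used is not stated); the function name yates_chi_square is hypothetical.

```python
def yates_chi_square(table):
    """Yates-corrected chi-square statistic for a 2x2 table of raw counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Continuity correction: shrink each deviation by 0.5 before squaring.
            chi2 += (abs(observed - expected) - 0.5) ** 2 / expected
    return chi2

# Rows: W II S IV V, W II/IV V; columns: ook, dan ook (counts from Table 6).
table_6 = [[4977, 136], [257, 14]]
print(round(yates_chi_square(table_6), 2))
```

With these counts the statistic rounds to the reported value of 5.08 at one degree of freedom.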


3.1.2 German 


Table 7 shows the distribution of the German irrelevance particles immer and auch 
and their combinations, based on the ConverGENTiecorpus, in primary construc- 
tions in which the W-word is not the subject. An example of each type is provided 
in (15). 


Table 7: Distribution of irrelevance particles in German primary constructions with W ≠ S in the 
ConverGENTiecorpus 


      W     II            S    IV      V      #        %
a.    W     auch immer    S    –       V     13   26.53%
b.    W     immer         S    –       V     22   44.90%
c.    W     immer         S    auch    V      8   16.33%
d.    W     –             S    auch    V      6   12.24%

                                             49  100.00%
(15) a. Wie auch immer man Neanderthaler sehen mag. Das extrem wechselhafte 
Bild spiegelte immer auch den Zeitgeist der jeweiligen Epoche wieder. 
(GE_Sci_Pop_0464) 
‘However one may view Neanderthals, their extremely variable 
image always reflected the Zeitgeist of the respective period.’ 

b. Wann immer ein Land in die Krise gerät, werden seine Bürger panisch 
die Konten räumen. (GE_Jou_Com_0767) 
‘Whenever a country plunges into a crisis, its citizens will empty 
their bank accounts in panic.’ 

c. Was immer er auch jetzt sagen könnte, er müßte sich festlegen. 
(GE_Lit_Fic_0005) 
‘Whatever he could say now, he would have to make a decision.’ 

d. Wo Forscher auch hinsehen, überall entdecken sie bisher unbekannte 
Arten. (GE_Sci_Pop_0630) 
‘Wherever scientists look, they discover previously unknown species 
everywhere.’ 


As can be seen from Table 7, the preferred position of irrelevance particles in 
German is clearly field II (71.43%) rather than field IV (12.24%). In fact, there are 
more instances where both fields are occupied (= type c; 16.33%) than instances 
where field IV is the only occupied field. The only particle that prefers field IV 
is auch, similarly to Dutch ook (cf. above). 


These general distributional tendencies are confirmed in the much larger 
sample from the DEREKo, as represented in Table 8. (16) provides an example of 
each type from the corpus. 


(16) a. Was auch die Gründe sein mögen, nur jammern [...] hilft auch nicht 
weiter. (St. Galler Tagblatt, 2-10-2001) 
‘Whatever the reasons may be, just complaining won’t help either.’ 

b. Wen auch immer man fragt: Esel finden alle irgendwie klasse. 
(Süddeutsche Zeitung, 3-6-2006) 
‘Whoever you ask: everyone thinks donkeys are great somehow.’ 

c. Wer immer auch die Täter sind, [...], sie müssen sich vorsehen. 
(Die Südostschweiz, 21-4-2010) 
‘Whoever the perpetrators are, they have to watch out.’ 

d. Was immer sie tun, Maitressen haben einen schlechten Ruf. 
(Süddeutsche Zeitung, 15-4-2014) 
‘Whatever they do, mistresses have a bad reputation.’ 

e. Doch was immer er auch tut, es reicht nicht. (die tageszeitung, 19-11-2013) 
‘But whatever he does, it is not enough.’ 

f. Mit wem ich auch rede, überall höre ich dasselbe. (plenary minutes, 
Berlin, 28-6-2001) 
‘Whoever I talk to, I hear the same everywhere.’ 

g. Wessen Socke das auch immer ist, es wird langsam langweilig. 
(Wikipedia Discussion Forums, 2011) 
‘Whoever’s sock that is, things are beginning to get boring.’ 

h. Zeitgemäße Dienstvereinbarungen, was das immer auch heißen möge. 
(plenary minutes, Sankt Pölten, 4-10-2001) 
‘Contemporary service contracts, whatever that may be.’ 

i. zu AC. @Hajog oder O. oder wer das immer ist. (Wikipedia Discussion 
Forums, 2011) 
‘to AC. @Hajog or O. or whoever that is.’ 


Table 8: Distribution of irrelevance particles in German primary constructions with W ≠ S in the 
DEREKo 


      W     II            S    IV            V        #        %
a.    W     auch          S    –             V       22    0.24%
b.    W     auch immer    S    –             V      954   10.53%
c.    W     immer auch    S    –             V      149    1.64%
d.    W     immer         S    –             V    6,075   67.05%
e.    W     immer         S    auch          V    1,005   11.09%
f.    W     –             S    auch          V      647    7.14%
g.    W     –             S    auch immer    V      154    1.70%
h.    W     –             S    immer auch    V       15    0.17%
i.    W     –             S    immer         V       39    0.43%

                                                  9,060  100.00%


The types represented in the ConverGENTiecorpus (cf. (15a-d) above) are precisely 
the four most frequent ones in the DEREKo, viz. immer occupying field II (67.05% 
in the DEREKo), immer ... auch straddling the subject field (11.09%), auch immer 
occupying field II (10.53%), and auch occupying field IV (7.14%). All other types, 
which account for less than 2% each and for only about 4.18% combined, are 
instances of the particles (or particle combinations) occupying their respective 
dispreferred field(s). Moreover, the basic tendency is confirmed that irrelevance 
marking in field II only (79.47%) is preferred over marking in both fields simulta- 
neously (11.09%) or in field IV only (9.44%). 
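
The three aggregate shares follow directly from the raw frequencies in Table 8. The sketch below recomputes them; the data layout (one tuple per row with the field II particles, the field IV particles and the count) and the function name marking_shares are illustrative assumptions.

```python
# (particles in field II, particles in field IV, raw frequency), from Table 8
TABLE_8 = [
    ("auch", "", 22),
    ("auch immer", "", 954),
    ("immer auch", "", 149),
    ("immer", "", 6075),
    ("immer", "auch", 1005),
    ("", "auch", 647),
    ("", "auch immer", 154),
    ("", "immer auch", 15),
    ("", "immer", 39),
]

def marking_shares(rows):
    """Percentage of instances marked in field II only, in both fields, or in IV only."""
    totals = {"II only": 0, "II and IV": 0, "IV only": 0}
    for field_ii, field_iv, count in rows:
        if field_ii and field_iv:
            totals["II and IV"] += count
        elif field_ii:
            totals["II only"] += count
        else:
            totals["IV only"] += count
    n = sum(totals.values())
    return {label: round(100 * count / n, 2) for label, count in totals.items()}

print(marking_shares(TABLE_8))
```

Summing rows a to d gives 7,200 of 9,060 instances for field II only, row e gives 1,005 for both fields, and rows f to i give 855 for field IV only.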
The most striking difference, however, is the proportion of immer in field II in 
the DEREKo (67.05%) when compared to both the ConverGENTiecorpus (44.90%) and 
Leuschner’s study based on the Mannheimer Korpus (34/92 instances = 36.96%, 
Leuschner 2000: 348). A one-tailed two-proportions Z-test suggests that the 
proportion in the DEREKo deviates significantly from the corresponding proportions in the 
ConverGENTiecorpus and the Mannheimer Korpus (p < 0.0001 in both), while the 
ConverGENTiecorpus and Mannheimer Korpus do not deviate significantly from 
each other (p = 0.18).¹¹ There are several potential explanations for this difference: 
1. Whereas Leuschner’s (2000) sample and the ConverGENTiecorpus contain 
search results for virtually all W-words, the DEREKo sample is limited to was 
and wer (incl. inflectional forms; cf. above). This means that almost all 
W-words that can form complex W-phrases, such as welch- (welches Haus ‘which 
house’) or wie (wie schön ‘how beautiful’), are excluded from the DEREKo 
sample. In fact, the only W-word in the DEREKo sample that can build complex 
phrases is wessen, which is by far the least frequent W-word in the sample 
(n = 252 or 1.08% of the total DEREKo sample). On the other hand, welch- and 
wie are the two most frequent W-words in the ConverGENTiecorpus sample, 
making up 49% of its instances. 

Since immer is only very rarely attested with complex W-phrases (cf. further 
below), but occurs very frequently with simple W-words such as was and 
wer, the difference in W-word coverage between the DEREKo on the one hand 
and the ConverGENTiecorpus and Mannheimer Korpus on the other hand may 
largely explain the proportional differences between these corpora. At a later 
stage of the investigation, welch- will be added to the DEREKo sample, 
presumably resulting in an overall lower proportion of immer. 

2. The distance operator of 1 in DEREKo search queries for W immer may have 
caused immer to be somewhat overrepresented in this sample. Since larger 
distance operators make the recall ratios less precise, it is easier to find 
instances of W immer compared to e.g. W ... auch with a distance operator of 4. 

3. Tendencies relating to text genre may play a role here. The relative portion of 
written press texts in the DEREKo is much larger than in the more balanced 
ConverGENTiecorpus and in the Mannheimer Korpus, which contained a larger 


11 The proportional difference between the DEREKo on the one hand and the ConverGENTiecorpus 
and Mannheimer Korpus on the other hand remains significant after a Bonferroni correction was 
carried out, which is used to counteract the increased risk of false positives when comparing 
more than two samples with a two-proportions Z-test. I thank Dr. Ludovic De Cuypere (Ghent) for 
introducing me to this method. Although Z-tests require independent data and the Mannheimer 
Korpus is included in the DEREKo, the enormous size difference between these two corpora 
(cf. above) nullifies this issue. 
proportion of literary texts. To test this hypothesis, the proportions of immer 
in press texts, parliamentary minutes, and Wikipedia texts were compared in 
two randomly drawn subsets containing 10% of the instances in Table 8 (n = 
906) for constructions with W ≠ S and in Table 9 (n = 931, cf. below) for 
constructions with W = S.¹² While text genre is often a large source of unwanted 
noise in corpus linguistics, its role seems to be fairly minor in this case: only 
the difference between press texts and Wikipedia texts in W II/IV V-constructions 
proved to be significant (two-tailed two-proportions Z-test p < 0.0001). 
While these findings suggest that the proportion of immer is not strikingly 
different across different text genres, it may still be worthwhile for further 
research to look into the effects of text genre on particle distribution, based 
on multiple genres and larger samples. 

4. It is conceivable that the proportional differences between Leuschner’s (2000) 
sample based on the Mannheimer Korpus, which was compiled in the 1960s, 
and the DEREKo sample, which consists mostly of texts from the 1990s-2010s, 
reflect a microdiachronic change. This is, however, rather unlikely, since the 
ConverGENTiecorpus consists of texts published from the 1990s until 2015, 
and yet shows a distribution similar to the Mannheimer Korpus. Another rea- 
son why the microdiachronic hypothesis is implausible, is that irrelevance 
particles in German are part of a larger “grammaticalization building-site” 
(Leuschner 2006; cf. Nübling 2005), and therefore unlikely to undergo dra- 
matic changes within a few decades (cf. further below). 
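
The one-tailed two-proportions Z-test used above to compare the share of immer in field II across corpora can be sketched as follows. The sketch assumes a pooled standard error, which the text does not specify, and uses only the standard library; the function name one_tailed_two_proportions_z is hypothetical.

```python
from math import erf, sqrt

def one_tailed_two_proportions_z(x1, n1, x2, n2):
    """Z statistic and upper-tail p-value for H1: x1/n1 > x2/n2.

    Assumption: pooled standard error under H0 (equal proportions).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * (1 - erf(z / sqrt(2)))
    return z, p_value

# immer in field II: ConverGENTiecorpus (22/49) vs. Mannheimer Korpus (34/92)
z, p = one_tailed_two_proportions_z(22, 49, 34, 92)
print(round(z, 2), round(p, 2))
```

Under these assumptions the comparison between the ConverGENTiecorpus and the Mannheimer Korpus yields a p-value of about 0.18, in line with the value reported above. When several such pairwise tests are run, a Bonferroni correction divides the significance threshold by the number of comparisons.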


Table 9 represents particle distributions, based on the DEREKo data, in those 
primary irrelevance constructions in which the W-word is also the subject of the 
subordinate clause. An example of each type with the verb passieren ‘to happen’ 
is given in (17).¹³ 


(17) a. Denn was auch passiert: Freilichtspiele sind immer ein Erlebnis. 
(Mannheimer Morgen, 16-6-2001) 
‘For whatever happens: open-air shows are always a great experience.’ 
b. Was auch immer passiert, es muss schnell geschehen. (Luxemburger 
Tageblatt, 28-6-2011) 
‘Whatever happens, it has to happen fast.’ 


12 I am grateful to an anonymous reviewer for suggesting this method to me. 

13 The ConverGENTiecorpus contains only 3 instances of immer in W II/IV V-constructions 
(42.86%), 2 with auch immer (28.57%), and 2 with auch (28.57%). Since the total number of 
occurrences is so low (n = 7), little can be said about these instances and they will not be dis- 
cussed any further. 
c. Was immer auch passiert, Gott will, daß wir glücklich sind. (Neue 
Kronen-Zeitung, 24-1-1995) 
‘Whatever happens, God wants us to be happy.’ 

d. Was immer passiert, wir sind bereit zu kämpfen. (St. Galler Tagblatt, 
15-2-1999) 
‘Whatever happens, we are prepared to fight.’ 


Table 9: Distribution of irrelevance particles in German primary constructions with W = S in the 
DEREKo 


      W     II/IV         V        #        %
a.    W     auch          V       79    0.85%
b.    W     auch immer    V    1,295   13.91%
c.    W     immer auch    V      640    6.87%
d.    W     immer         V    7,299   78.37%

                               9,313  100.00%


As with W II S IV V-constructions in Table 8, immer is the most frequent irrele- 
vance particle in Table 9 (78.37%), but auch immer (13.91%) is more common than 
immer (...) auch (6.87%) in W II/IV V-constructions. Auch occurs only marginally in 
the latter subordinate clause type (0.85%). In accordance with these observations, a 
chi-square test with standardized residuals, as shown in Table 10, suggests that 
immer and auch immer occur significantly more often in W II/IV V-constructions, 
whereas auch and immer (...) auch show a strong preference for the W II S IV 
V-constructions (χ² = 735.97; df = 3; p < 0.0001; Cramér’s V = 0.20). 


Table 10: Chi-square test comparing occurrences of immer, auch, auch immer, and immer (...) 
auch in both types of German primary irrelevance constructions in the DEREKo. Standardized 
residuals are given in brackets, no cells have an expected value below 5 


                 immer      auch       auch immer   immer (...) auch   total

W II S IV V      6,114      669        1,108        1,169              9,060
                 (-6.15)    (+15.63)   (-2.24)      (+9.27)

W II/IV V        7,299      79         1,295        640                9,313
                 (+6.07)    (-15.41)   (+2.2)       (-9.15)

total            13,413     748        2,403        1,809              18,373
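
The test statistic, the standardized residuals and the effect size for Table 10 can be recomputed from the observed counts. The sketch below is a plain-Python illustration, not the author's own procedure; the function name chi_square_with_residuals is hypothetical, and no continuity correction is applied since the table is larger than 2x2.

```python
from math import sqrt

def chi_square_with_residuals(table):
    """Pearson chi-square, standardized residuals and Cramér's V for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Standardized (Pearson) residual: signed deviation in units of sqrt(expected).
            residual_row.append((observed - expected) / sqrt(expected))
        residuals.append(residual_row)
    cramers_v = sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))
    return chi2, residuals, cramers_v

# Rows: W II S IV V, W II/IV V; columns: immer, auch, auch immer, immer (...) auch
table_10 = [[6114, 669, 1108, 1169], [7299, 79, 1295, 640]]
chi2, residuals, v = chi_square_with_residuals(table_10)
print(round(chi2, 2), round(v, 2))
print([[round(r, 2) for r in row] for row in residuals])
```

A residual above |2| marks a cell that deviates notably from its expected value; here the largest deviations are found in the auch column.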
3.2 Secondary irrelevance constructions 


3.2.1 Dutch 


Table 11 represents the distribution of the Dutch irrelevance particle(s) (dan) ook 
in secondary constructions in the ConverGENTiecorpus. Particle distributions in 
the SoNaR corpus are given in Table 12. Examples from the corpora are provided 
in (18) resp. (19). 


Table 11: Distribution of irrelevance particles in Dutch secondary constructions in the 
ConverGENTiecorpus 


       ook       dan ook    total
#      38        55         93
%      40.86%    59.14%     100.00%


(18) a. Als een rode draad door zijn politiek loopt tenslotte zijn constante 
weigering om welk akkoord ook te sluiten (NE_Jou_New_0715) 
‘A central theme in his politics is, after all, his constant refusal to 
sign any agreement (lit. whichever agreement)’ 

b. Alleen het lezen van deze letters in het Frans of welke andere taal dan 
ook leidt tot verbazingwekkende resultaten. (NE_Cor_Pro_0016) 
‘Simply reading these letters in French or in whichever other language 
leads to amazing results.’ 


Table 12: Distribution of irrelevance particles in Dutch secondary constructions in the 
SoNaR corpus 


       ook       dan ook    total
#      975       2,946      3,921
%      24.87%    75.13%     100.00%


(19) a. Jij hoeft u daarover niet te schamen of wat ook. 
(WR-P-E-G-0000005399) 
‘You don’t have to be ashamed of that or whatever.’ 
b. Een fusie met wie dan ook is geen optie. (WR-P-P-G-0000599808) 
‘A fusion with anyone (lit. whoever) is not an option.’ 
Although dan ook is clearly the more frequent option in both corpora, instances 
with ook still account for a considerable proportion of the total. They mainly 
occur in a specific context, however, namely with indefinite pronouns in 
comparative constructions. 673 out of 975 occurrences of ook in secondary constructions 
are comparatives (69.03%). 

In all 856 comparatives in SoNaR, we find a tendency to use the single particle 
ook (n = 673 or 78.62%) rather than the particle combination dan ook (n = 183 or 
21.38%): 


(20) ik weet meer dan wie ook over armoede (WR-P-E-A-0000410476) 
‘I know more than anyone (lit. whoever) about poverty’ 


According to Hoeksema (2012: 96), the reason for this tendency is that speakers 
want to avoid a “double dan” (i.e. horror aequi). Since the comparative particle in 
Dutch happens to be dan (Reinarz/de Vos/de Hoop 2016), speakers tend to prefer 
dan wie ook over dan wie dan ook. Moreover, comparative constructions tend to 
be used with animate pronouns (e.g. wie ‘who’ rather than wat ‘what’; Hoeksema 
2012: 98), and this could explain why the proportion of ook is significantly higher in 
secondary irrelevance constructions with wie, while dan ook shows a significant 
preference for inanimate wat (Yates χ² = 601.88; df = 1; p < 0.0001; Cramér’s V = 
0.39). 


Table 13: Chi-square test comparing occurrences of Dutch irrelevance particle(s) (dan) ook 
in secondary constructions in the SoNaR corpus. Standardized residuals are given in brackets, no 
cells have an expected value below 5 


         ook              dan ook          total

wie      704 (+16.61)       822 (-9.56)    1,526
wat      271 (-13.27)     2,121 (+7.64)    2,392
total    975              2,943            3,918


3.2.2 German 


The distribution of German irrelevance particles in secondary constructions in 
the ConverGENTiecorpus is shown in Table 14; Table 15 represents their distribu- 
tion in the DEREKo. Examples from the corpora are given in (21) resp. (22). 

Table 14: Distribution of irrelevance particles in German secondary constructions in the 
ConverGENTiecorpus 


       immer     auch immer    auch     total
#      6         26            3        35
%      17.14%    74.29%        8.57%    100.00%


(21) a. Weitergabe des Mietgegenstandes an natürliche oder juristische Per- 
sonen in welcher Form immer ist dem Mieter untersagt. 
(GE_Ins_Con_0104) 

‘A transfer of the rental property to natural or legal persons in which- 
ever form is prohibited to the tenant.’ 

b. Feigheit, Faulheit, was auch immer. (GE_Jou_Com_1040) 

‘Cowardice, laziness, whatever.’ 

c. Sollte der Mieter, aus welchen Gründen auch, seinen Mietvertrag annul- 
lieren, erklärt er sich bereit, dem Vermieter Schadenersatz zu erstatten. 
(GE_Ins_Con_0027) 

‘Should the tenant, for whichever reasons, cancel their rental contract, 
they agree to pay the landlord a compensation.’ 


Table 15: Distribution of irrelevance particles in German secondary constructions in the DEREKo 


       immer    immer auch    auch immer    auch     total
#      399      18            4,485         24       4,926
%      8.10%    0.37%         91.05%        0.49%    100.00%


(22) a. Zum Einstieg, zum Verführen, als kleine Zwischenmahlzeit, als was 

immer: Tapas müssen auf den Tisch. (Nürnberger Nachrichten, 8-3-1999) 
‘As a Starter, as a temptation, as a small snack in between, as whatever: 
there have to be tapas on the table.’ 

b. Aber wer könnte ein Interesse daran haben, Ihnen was immer auch 
zuzufiigen? (Emme, Pierre: Florentinerpakt, 25-3-2011) 
‘But who could benefit from inflicting anything (lit. whatever) upon 
you?’ 

c. Ich bin wichtig. Ich bin... was auch immer. (Braunschweiger Zeitung, 
23-10-2010) 
‘I am important. I am ... whatever.’

d. Ob Baldi, Plüss oder wer auch sonst: Bern braucht vor allem eines:
Den Mut, mit der Vergangenheit zu brechen. (Zürcher Tagesanzeiger, 
30-11-1998) 

‘Whether Baldi, Plüss or whoever else: Bern needs one thing above 
all: the courage to leave the past behind.’ 


Whereas immer is the most frequent particle in primary irrelevance construc- 
tions, it plays only a minor role in secondary constructions. Instead, the latter are 
clearly dominated by auch immer, the only particle (or particle combination) in the 
DEREKo sample that prefers secondary over primary constructions (4,485 secondary
constructions out of 6,889 total instances = 65.10%). By contrast, all the other
particles and particle combinations clearly prefer primary constructions (immer: 
399 instances are secondary constructions out of 13,812 total instances = 2.89%; 
immer auch: 18/1,827 = 0.99%; auch: 24/772 = 3.11%). This holds especially for the 
other particle combination, immer auch, which does not occur in secondary con- 
structions in the ConverGENTiecorpus at all. 


4 Differences and similarities 


Now that the distributional patterns of irrelevance particles in different construc- 
tion types have been described in section 3, the most striking differences and 
similarities between certain particles or particle combinations will be discussed 
below. 


4.1 German auch and Dutch ook 


As mentioned above, the etymologically identical irrelevance particles auch and 
ook share their overwhelming rightward tendency. In fact, the distributional tendencies
of auch and ook are strikingly similar (auch occupies field IV in 647 out of
669 W II S IV V-instances in the DEREKo = 96.71%; ook does so in 4,808/4,977
instances in the SoNaR corpus = 96.60%). Their distributional patterns are
statistically indistinguishable both in the language-specific corpora and in the
subcomponents of the ConverGENTiecorpus
(χ² < 0.001, n = 5,646, df = 1, p > 0.99 for the DEREKo and SoNaR corpus; Fisher’s
Exact Test: p > 0.99 for the ConverGENTiecorpus). This rightward tendency has
been explained in terms of disambiguation: according to Leuschner (2000: 354), 
auch is more likely to be misinterpreted as a narrow-scope focus particle in field II 
and more likely to be read as a wide-scope irrelevance particle in field IV (cf. also 
Bossuyt/De Cuypere/Leuschner 2018: 110). The same explanation applies to the
Dutch particle ook. 

As has been observed before (Leuschner 2000: 350), occupation of field II by 
auch (and ook) is not only much rarer, but also more restricted, as auch and ook 
can only occur, at least as irrelevance particles, before lexical subjects, not before 
pronouns: 


(23) a. Was auch die Abgeordneten des Bundestags entscheiden - das letzte 
Wort hat immer wieder das Bundesverfassungsgericht. 
(Nürnberger Zeitung, 21-12-2012) 
‘Whatever the delegates of the Bundestag decide - the Federal Con- 
stitutional Court always has the last word.’ 

b. Wie ook zijn medewerkers waren in de regering, bij de Europese Com- 
missie of het Europees Parlement, allen bewaren ze goede herinnerin- 
gen aan hun vroegere ‘baas’. (NE_Lit_Non_1208) 

‘Whoever his fellow workers were in the government, at the European 
Commission or the European Parliament, all have good memories of
their former ‘boss’.’


(24) a. *Was auch die entscheiden [...]
‘Whatever they decide ...’
b. *Wie ook zij waren [...]
‘Whoever they were ...’


This positional restriction can be explained by the general tendency of German and 
Dutch lexical subjects to occupy their base position in [Spec, VP] (Lenerz 1993: 118), 
i.e. the right periphery of the middle-field, occasionally forcing auch/ook to occupy 
field II despite the above-mentioned risk of ambiguity. According to Behaghel’s 
(1909) “Law of Increasing Constituents” and the principle of end-weight, the pref- 
erence to occur further to the right is especially strong with lengthier lexical sub- 
jects. Conversely, German and Dutch pronouns generally prefer to occupy the left 
periphery of the middle-field, also known as the “Wackernagel position” (Weiß 
2018). Since pronouns are typically thematic, expressing discourse-old, given 
information, they tend to occur before rhematic, i.e. discourse-new information, 
which is typically expressed through lexical word classes such as NPs (cf. Noel 
Aziz Hanna 2015: 46). Auch thus never precedes pronouns because its positional 
preferences are perfectly complementary to those of pronouns. 

4.2 German auch immer and Dutch dan ook


The particle combinations auch immer and dan ook share four notable similarities. 
The first is that they are the most frequent option in secondary irrelevance con- 
structions in their respective language, as seen in section 3.2. Although auch immer 
may seem to be more dominant in German secondary constructions (26 out of 35 
instances in the ConverGENTiecorpus = 74.29%; 4,485/4,926 in the DEREKo = 
91.05%) compared to Dutch dan ook (55/93 in the ConverGENTiecorpus = 59.14%; 
2,946/3,921 in the SoNaR corpus = 75.13%), this difference is only significant in 
the language-specific corpora (two-tailed two-proportions Z-test p < 0.0001), not 
in the ConverGENTiecorpus (p = 0.11). It can be explained by the fact that W auch/ 
immer-pronouns in comparative constructions do not occur in the German sample 
at all, while being very frequent in the Dutch sample. As mentioned above (cf. 
section 3.2.1), it is in this exact context that Dutch secondary irrelevance con- 
structions show a tendency to take the single particle ook rather than the particle 
combination dan ook. 
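
The two-proportions comparison reported in this paragraph can be retraced with a few lines of code. The sketch below assumes a standard two-tailed Z-test with a pooled standard error and no continuity correction; the function name and its parameters are illustrative and not taken from any particular statistics package.

```python
import math


def two_proportion_z_test(k1, n1, k2, n2):
    """Two-tailed Z-test for the difference between two proportions.

    k1/n1 and k2/n2 are the observed proportions; the standard error is
    computed from the pooled proportion (assumption: no continuity correction).
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# auch immer vs. dan ook in secondary constructions (counts as cited above).
z_small, p_small = two_proportion_z_test(26, 35, 55, 93)          # ConverGENTiecorpus
z_large, p_large = two_proportion_z_test(4485, 4926, 2946, 3921)  # DEREKo vs. SoNaR

print(f"ConverGENTiecorpus: z = {z_small:.2f}, p = {p_small:.2f}")
print(f"DEREKo vs. SoNaR:   z = {z_large:.2f}, p = {p_large:.2g}")
```

Under these assumptions the smaller sample yields a p-value of roughly 0.11, while the difference in the large language-specific corpora falls far below the 0.0001 threshold.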

Consistent with this similarity, both auch immer and dan ook are specialized 
for secondary irrelevance constructions: if all instances of auch immer and dan ook 
are considered, a clear majority of them turn out to be secondary constructions. 
Dan ook seems to specialize even more for secondary constructions than auch 
immer: all 55 instances of dan ook in the ConverGENTiecorpus are secondary con- 
structions, compared to only 26 out of 41 instances with auch immer (63.41%). A 
similar pattern is found in the language-specific corpora (2,946/3,096 in the SoNaR
corpus = 95.16% vs. 4,485/6,889 = 65.10% in the DEREKo; two-tailed two-proportions
Z-test: p < 0.0001).¹⁴ Thus, whereas German auch immer occurs both in secondary
irrelevance constructions (where it clearly predominates) and primary constructions,
Dutch dan ook is almost exclusively found in secondary constructions.

The third similarity of auch immer and dan ook is that these particle combina- 
tions are never broken up by any other constituent, i.e. that the components auch 
and immer, or dan and ook, always occur next to each other. Using terminology
suggested by Thurmair (1989: 290) for modal particles, auch immer and dan ook
thus qualify as “closed” particle combinations. This suggests that these erstwhile
particle combinations have been reanalyzed as single complex particles, enabling
them to function as “indefiniteness markers” to the W-stem (in the terminology
of Haspelmath 1997) in secondary irrelevance constructions.


14 The two-proportions Z-test cannot be performed on the data from the ConverGENTiecorpus 
because the difference between the numerator and denominator is < 5 for the Dutch data (55/55 = 
100.00%). 

The fourth similarity is the statistically significant preference of auch immer
and dan ook for primary constructions of the W II/IV V-type over the W II S IV 
V-type (cf. above, sections 3.1.1 and 3.1.2). Given that W II/IV V-constructions 
have no subject field and therefore tend to be shorter than W II S IV V-construc- 
tions, the preference of auch immer and dan ook for shorter or elliptically reduced 
subordinate clauses (as observed by Leuschner 2000: 353 for auch immer) is not 
surprising. 

The most notable difference between auch immer and dan ook is the comple- 
mentary nature of their positional tendencies in primary irrelevance constructions. 
In W II S IV V-constructions, auch immer shows a strong leftward tendency, occupying
field II in 954 out of 1,108 total instances in the DEREKo (86.10%). Dan ook,
on the other hand, shows a strong rightward tendency in this construction type 
(132/136 in field IV in the SoNaR corpus = 97.06%). This might seem like a prob- 
lem, as it has been argued that auch immer’s leftward tendency is one of the major 
factors that caused this particle combination to specialize for secondary construc- 
tions (Bossuyt 2016: 64): a W-word and one or more subsequent irrelevance par- 
ticles are more likely to be reanalyzed as a new unit if the particles typically occur 
in immediate adjacency to the W-word. This factor may well apply to German 
auch immer, but it is obviously irrelevant for dan ook, given that Dutch does not 
have any irrelevance particles with a leftward tendency to begin with. Rather, it is
dan ook and not ook alone that specializes for secondary constructions because
a complex particle is less prone to ambiguity as an indefiniteness
marker than a single particle. This is especially true in clause-medial contexts, in 
which secondary constructions often occur. For the very same reason, the complex 
particle auch immer is more frequent in German secondary constructions than 
immer, which also has a preference for field II, but is a single particle instead of 
a closed particle combination (ibid.). 


4.3 German immer and English -ever 


Not only is German immer related to English -ever etymologically to the extent 
that the initial i- in immer is cognate with the e- in English ever (Leuschner 1996), 
its leftward tendency is reminiscent of the positional shift undergone by ever in 
the history of English. In present-day English, attaching itself to the WH-word is the 
only option for -ever (cf. above). For immer, it is almost the only option: immer 
occupies field II in 6,075 out of 6,114 W II S IV V-instances in the DEREKo sample 
(99.36%). Although immer competes for this position with pronominal subjects 
(cf. section 4.1 above), the pronoun has successfully forced immer to occupy field 
IV in only 39 instances in the entire DEREKo sample. Since only pronouns compete
for Wackernagel’s position, immer never occurs behind lexical subjects, as
shown by (26) in comparison with the original in (25): 


(25) Und was es immer gewesen sein mag: Der Verdächtige ist nicht vorbestraft 
und erst recht nicht verurteilt. (Nürnberger Nachrichten, 16-5-1998) 
‘And whatever it may have been: the suspect has not been previously con- 
victed and surely never been sentenced.’ 


(26) * Was das Verbrechen immer gewesen sein mag [...] 
‘Whatever the crime may have been ...’ 


As shown by Leuschner (2001; 2006: 134-140), immer and ever first occurred in 
irrelevance constructions as free-choice adverbs supporting the quantificational 
effect of the semantically opaque irrelevance markers so ... so (e.g. Old English 
swa hwylc swa ‘whoever, whichever’; Old High German so wér so ‘whoever’). 
Immer and ever then began replacing so ... so as the main irrelevance marking 
strategy, a grammaticalization process which was accompanied by the omission 
of the left-hand swa (> so) in English, eventually resulting in WH-(so)-ever-com- 
pounds, and the right-hand so (> s-) in German, resulting in combinations like 
swâ iemer ‘wherever’. English -so- was eventually left out completely in irrelevance
constructions (for whatsoever, which still occurs as a post-nominal intensifier, 
cf. above and Leuschner 2001), and German irrelevance marking s-W-words collapsed
with bare W-words in the 14th century (Leuschner 2006: 135), leaving iemer
(> immer) and auch as clause-internal irrelevance marking. Both ever and immer
occurred initially in field IV, i.e. in the typical position of adverbs, but following
their reanalysis as quantificational particles began shifting towards field II as 
so ... so became increasingly obsolete and the new strategies of irrelevance marking 
became more and more obligatory (cf. Leuschner 2006 and Bossuyt/De Cuypere/ 
Leuschner 2018 for more details). 

While immer and -ever both underwent grammaticalization, this process 
happened much faster in English than in German. The last instances with æure
(> ever) in field IV seem to be attested around the 12th century:


(27) Luue ðine nexte al swa ðe seluen, hwat manne swa he æure bie! (cited in
Leuschner 2006: 135) 
‘Love thy neighbour like thyself, whatever man he be!’ 


In German, however, the positional tendencies of immer and auch did not emerge 
clearly until well into the 19th century (Leuschner 2006: 136), as suggested by
verses like (28) from Johann Wolfgang von Goethe (1749-1832): 

(28) Und man kommt in’s Gered’, wie man sich immer stellt. (cited in Goethe’s
Faust I, line 3201) 
‘And one becomes the subject of gossip, however one (lit.: how one ever) 
positions oneself.’ 


Unlike with -ever, the grammaticalization of immer is still incomplete. While 
phrases like whichever house and however beautiful are perfectly grammatical in 
English, their German equivalents with immer are ungrammatical or at least highly 
unusual: welches *(immer) Haus '(immer), wie *(immer) schön (immer). When 
wessen ‘whose’ modifies an intervening NP, as in (29a), immer is ruled out, but 
auch immer is allowed.¹⁵ When wessen functions as a genitive object, by contrast,
and no constituents intervene between the W-word and the particle as in (29b), 
immer is unproblematic: 


(29) a. mit wessen Geld auch immer [*immer] sie bezahlt wurden 
(St. Galler Tagblatt, 18-3-2010) 
‘with whoever’s money they were paid’
b. wessen immer man mich anklagt (Süddeutsche Zeitung, 31-3-1998) 
‘Whatever (some)one accuses me of’ 


Immer also seems to be problematic with complex W-words such as womit ‘with 
what/which’ (lit. ‘where-with’), as suggested by Leuschner (2000: 350). These 
restrictions have so far prevented immer from becoming the sole irrelevance par- 
ticle in German and attaining univerbation with the W-word, as has happened in 
English. Its obligatorification seems to be counteracted by the presence of other 
particles, as the above-mentioned restrictions are more likely to encourage the 
use of auch or particle combinations rather than immer alone in these specific 
contexts. 


5 Conclusion and prospects 


The present study has documented and analyzed the distributional patterns of the
irrelevance particles -ever, immer and/or auch and (dan) ook in both primary and 
secondary irrelevance constructions. A contrastive corpus triangulating approach 


15 (29a) would be grammatical with auch immer in either field II or field IV, or, alternatively, 
with auch in field IV. In any case, immer in field II is ruled out. 

was adopted, thereby expanding the scope of a previous study by Bossuyt/De
Cuypere/Leuschner (2018) and providing a semi-replication of Leuschner (2000). 

From a diachronic perspective, the synchronic analysis can be read as a snap- 
shot of a long-term process of emergence-by-grammaticalization. As far as primary 
constructions are concerned, this is nearly completed in English, where -ever is 
the sole irrelevance particle and only occurs in univerbation with the WH-word. 
There are only a few small defects in the WH-ever-paradigm, like the *whyever 
gap (Leuschner 2006: 41) and residual -so- in intensifying whatsoever. In German, 
the grammaticalization process is not only incomplete, but seems to have lost 
its former directionality: although immer shows a very similar leftward tendency 
to -ever, it has not yet reached univerbation with the W-word and seems unlikely 
to do so in the foreseeable future because its obligatorification is hindered by 
the systemic presence of both auch and the particle combination auch immer. 
Thus, the W immer/auch-paradigm seems to be stuck in an uneasy balance. Dutch 
W ... ook shows only weak signs of grammaticalization: although ook did undergo 
function-specific semantic changes when it was recruited from the focus particle, 
and shows a clear preference for field IV (cf. Bossuyt 2016: 59 and Leuschner 
2013: 53 on German auch), its position in this field is not absolute and the result 
of its preferential position is precisely to make it discontinuous with the W-word. 
Ook thus fails to show even the most rudimentary signs of coalescence (Lehmann 
2015: 157-167), a clear indication that any further increase in grammaticalization 
is blocked. 

In secondary constructions, however, we see a different pattern. Dutch W dan
ook is highly specialized for secondary constructions and the most functionally 
versatile of all three languages, occurring as a discourse marker, general extender, 
and indefinite pronoun. German W auch immer occurs frequently in the first two 
functions, but is still rare as an indefinite pronoun. The same is true for English 
WH-ever in secondary constructions, mainly due to the systemic presence of the 
any-series. 

The subsystem of irrelevance marking through particles thus participates in 
the larger “grammaticalization building-site” of concessive conditionality in Eng- 
lish, German, and Dutch (Leuschner 2006). Follow-up research should look at 
the interaction, both in terms of quantification (i.e. semantics) and of surface 
distribution, between irrelevance particles on the one hand and expressions of 
epistemic modality, particularly may/mögen/mogen, and the present subjunctive 
as strategies of free-choice quantification on the other hand: 


(30) a. It might have something to do with people trying to express their frus- 
tration -- whatever that may be. 
(COCA, NEWS: Atlanta) 
b. Steht am Schluss eines Artikels “pd”, hat nicht die Zeitungsredaktion
geschrieben, sondern der “Pressedienst”, wer immer das sein möge. 
(St. Galler Tagblatt, 11-5-2012) 
‘If it says “pd” at the end of an article, then the newspaper editorial 
didn’t write [it], but the “press-service”, whoever that may be.’ 

c. Wat Henin ook moge beweren, zij start als favoriete. 
(WR-P-P-G-0000237815) 
‘Whatever Henin may claim, she starts as the favorite.’ 


Another avenue is to investigate the alternation between clause-internal strategies
of irrelevance marking, i.e. through irrelevance particles, and clause-external strat- 
egies such as elliptical expressions of irrelevance, which come to the building-site 
with a grammaticalization history of their own (Leuschner 2006). As Bossuyt/De 
Cuypere/Leuschner (2018: 117) demonstrate, rare instances of overlap between 
clause-external and clause-internal strategies exist in German, and equivalent 
examples occur in Dutch: 


(31) a. Egal, was sie auch tun (die tageszeitung, 2-12-2006) 
‘No matter what (lit. whatever) they do’ 
b. Het is het nie[t] waard jong, gelijk met wie je ook zo een one-night-stand 
zou willen doen. (WR-P-E-A-0000047811) 
‘It’s not worth it, man, no matter with whom (lit. whomever) you would 
like to have a one night stand.’ 


The language-specific and cross-linguistic patterning of such an overlap remains to 
be seen. Equivalent structures in English would feature a WH-ever-word in combi- 
nation with no matter (or some other clause-external marker). The fact that there are 
no such examples in the BYU-sample at all matches the observation that the overlap 
occurs in German mainly with the rightward-tending auch, but only rarely with the 
leftward-tending immer. Any future investigation taking into account irrelevance 
markers other than clause-internal particles is thus likely to confirm the position 
of German irrelevance marking strategies in between those of Dutch and English. 


Corpora 


Davies, Mark: BYU-BNC. (Based on the British National Corpus from Oxford University Press). 
Available online at https://corpus.byu.edu/bnc/.

Davies, Mark: The Corpus of Contemporary American English (COCA): 560 million words, 
1990-present. Available online at https://corpus.byu.edu/coca/. 

Davies, Mark: The Strathy Corpus: 50 million words of Canadian English. Available online at
https://corpus.byu.edu/can. 

Davies, Mark: Hansard Corpus: part of the SAMUELS project. Available online at
www.hansard-corpus.org/.

Davies, Mark: The Wikipedia Corpus: 4.6 million articles, 1.9 billion words. Adapted from 
Wikipedia. Available online at https://corpus.byu.edu/wiki/. 

Instituut voor de Nederlandse taal: OpenSoNaR. Available online at http://opensonar.inl.nl/.

Leibniz-Institut für Deutsche Sprache, Mannheim: Das Deutsche Referenzkorpus DEREKo.
Available online at www.ids-mannheim.de/kl/projekte/korpora/. 

Lauwers, Peter/Plevoets, Koen: ConverGENTiecorpus. Institutionally available at Ghent University. 

Peter Dirix/Liesbeth Augustinus/Frank Van Eynde (Leuven)
IPP in Afrikaans: a corpus-based 
investigation and a comparison with 
Dutch and German 


Abstract: In the West-Germanic languages we expect an auxiliary of the perfect to
select a past participle. In a subset of these languages, however, some verbs select
an infinitive instead, i.e. in constructions known as infinitivus pro participio (IPP). 
The phenomenon is well-studied with regard to Dutch and German, but for Afri- 
kaans an extensive study based on empirical data is still lacking. In order to fill 
this void, the present paper uses a corpus study to identify the verbs which - 
obligatorily or optionally - take the IPP form in Afrikaans. Verb classes showing 
the IPP effect in Afrikaans, Dutch and German are compared, and crosslinguistic 
similarities and differences are identified. The result is a corpus-based typology 
of IPP verbs in the three languages in question. 


Zusammenfassung: In den westgermanischen Sprachen wäre eigentlich zu erwarten,
dass Perfekt-Hilfsverben immer ein Partizip Perfekt selektieren. In einer Untergruppe
dieser Sprachen selektieren einige Verben jedoch einen Infinitiv, den so
genannten infinitivus pro participio (IPP). Während dieses Phänomen hinsichtlich
des Niederländischen und Deutschen bereits eingehend erforscht worden ist,
fehlt zum Afrikaans bisher eine umfangreichere, empirisch fundierte Studie. Um 
diesem Mangel abzuhelfen, werden in dem vorliegenden Beitrag mittels einer 
Korpusuntersuchung diejenigen Verben identifiziert, die im Afrikaans - obligato- 
risch oder optional - in der IPP-Form auftreten. Wir vergleichen die Verbklassen, 
die auf Afrikaans, Niederländisch und Deutsch den IPP-Effekt zeigen, und stellen 
Ähnlichkeiten und Unterschiede zwischen den Sprachen fest. Das Ergebnis ist 
eine korpusbasierte Typologie von IPP-Verben in den drei betroffenen Sprachen. 


Open Access. © 2020 Dirix/Augustinus/Van Eynde, published by De Gruyter. This work is
licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. 
https://doi.org/10.1515/9783110668476-005 

1 Introduction


In the West-Germanic languages the auxiliary of the perfect combines with a past 
participle (PSP), as illustrated in (1-6) for German (DE), Dutch (NL), Afrikaans (AF), 
English (EN), West Frisian (FY), and Yiddish (YI).¹


(1) DE  Wir haben nichts gesehen.
(2) NL  Wij hebben niets gezien.
(3) AF  Ons het niks gesien nie.
(4) EN  We have not seen anything.
(5) FY  Wy hawwe neat sjoen.
(6) YI  Mir hobn gornisht gezen.


If the participle combines with an infinitival complement, though, it may and 
sometimes must take the form of an infinitive, as illustrated in (7-9). 


(7) DE  Ich habe Johann einen Roman schreiben sehen/gesehen.
(8) NL  Ik heb Johan een roman zien/*gezien schrijven.
(9) AF  Ek het Johan ’n roman sien/gesien skryf.
        ‘I saw John write a novel.’


The phenomenon is known as infinitivus pro participio (IPP).² It occurs in German,
Dutch and Afrikaans, but not in English, Frisian or Yiddish (Wurmbrand 2004; 
Schmid 2005: 138). It is an interesting topic for comparative research, because it 
shows a large degree of variation. Notice, for instance, that the Dutch zien in (8) 
must take the IPP form, while its counterparts in German and Afrikaans may take 
either the usual participial form or the IPP form. 

While German and Dutch IPP have been studied extensively, Afrikaans IPP 
has received far less attention. The first objective of this study therefore is to iden- 
tify the verbs which - obligatorily or optionally — take the IPP form in Afrikaans. 
For this purpose we make use of two corpora. The second objective is to compare 
the results with those for Dutch and German. For this comparison, we use the
typology of IPP verbs that has been proposed in Augustinus/Van Eynde (2017). 


1 We thank the audience of the Germanic Sandwich conference (March 2017, Münster), the
anonymous reviewers and Alexander Hurst for their comments.
2 Other terms are Ersatzinfinitiv and substitute infinitive, see Den Besten/Edmondson (1983). 

The paper starts with a survey of the literature on IPP in Afrikaans in section 2.
In section 3 we present the corpus study. Section 4 describes the typology of 
Augustinus/Van Eynde (2017) and compares it to other lists and classifications for 
German and Dutch IPP verbs. In addition, an extension to the typology is pro- 
posed, making use of the data from the corpus study. Section 5 summarizes our 
findings. 


2 IPP in Afrikaans 


This section provides a brief survey of the literature on IPP in Afrikaans. To pave 
the way we first present some facts about the verbal paradigm (2.1). Then we dis- 
cuss the constructions in which the IPP effect occurs (2.2). A separate section is 
devoted to the modal verbs (2.3). 


2.1 The verbal paradigm 


Afrikaans verbs do not show much inflectional variation. The finite forms do not 
show any variation for person or number and the present/past distinction is — 
for most verbs — not morphologically marked (De Vos 2005). Instead the past is 
marked by the perfect auxiliary het ‘have’ in combination with the past partici- 
ple, as illustrated for the verb bly ‘to stay’. 


ek bly ‘I stay’                     ek het gebly ‘I stayed’
jy/u³ bly ‘you (sing.) stay’        jy/u het gebly ‘you (sing.) stayed’
hy/sy/dit bly ‘he/she/it stays’     hy/sy/dit het gebly ‘he/she/it stayed’
ons bly ‘we stay’                   ons het gebly ‘we stayed’
julle/u bly ‘you (plur.) stay’      julle/u het gebly ‘you (plur.) stayed’
hulle bly ‘they stay’               hulle het gebly ‘they stayed’


The past participle canonically has the prefix ge-, but this prefix is omitted if the 
verb already has an unstressed prefix, such as be-, ge-, or ver-, as shown in jy het 
begin ‘you started’.⁴ The only verbs with a morphologically marked past tense,
henceforth called preterite, are the copula and most of the modals. 


3 Jy is the informal form. The polite form u is hardly ever used in present-day Afrikaans.
4 A form like gebegin is sometimes observed in colloquial language. 

ek is ‘I am’                 ek was ‘I was’
ek kan ‘I can’               ek kon ‘I could’
ek wil ‘I want’              ek wou ‘I wanted’
ek sal ‘I will’              ek sou ‘I would’
ek moet ‘I must (pres.)’     ek moes ‘I must (past)’
ek mag ‘I may’               ek mog ‘I might’⁵


The infinitive has the same form as the simple present tense, except in the case of 
the copula wees ‘be’ and the main verb hê ‘have’. They contrast respectively with 
is and het (Donaldson 1993). 

The passive is formed by the combination of a passive auxiliary and a past 
participle. The auxiliary is word ‘become’ for the present and is ‘be’ for the past. 
Compare the active sy slaan hom ‘she hits him’ and sy het hom geslaan ‘she hit 
him’ with the passive counterparts hy word deur haar geslaan ‘he is hit by her’ and 
hy is deur haar geslaan ‘he was hit by her’. In colloquial speech one also hears 
was (instead of is), but this is considered an Anglicism, being a double past. 
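
The regularities of past-tense formation described in this section can be summarized in a small sketch. It is an illustration only: the function names are hypothetical, the information on whether a verb already carries an unstressed prefix has to be supplied by the caller (string matching on be-, ge- or ver- alone cannot distinguish a prefix from part of the stem), and the main verb hê ‘have’ as well as the archaic form mog are deliberately left out.

```python
# Preterite forms taken from the paradigm above; every other verb forms its
# past periphrastically with het + past participle.
PRETERITES = {"is": "was", "kan": "kon", "wil": "wou", "sal": "sou", "moet": "moes"}


def past_participle(verb: str, has_unstressed_prefix: bool = False) -> str:
    """Prefix ge- unless the verb already carries an unstressed prefix.

    The flag is supplied by the caller (a hypothetical design choice for this
    sketch), since the prefix cannot be detected reliably from the spelling.
    """
    return verb if has_unstressed_prefix else "ge" + verb


def past(subject: str, verb: str, has_unstressed_prefix: bool = False) -> str:
    """Build a simple past clause; the main verb hê 'have' is not covered."""
    if verb in PRETERITES:
        return f"{subject} {PRETERITES[verb]}"
    return f"{subject} het {past_participle(verb, has_unstressed_prefix)}"


print(past("ek", "bly"))                                # ek het gebly
print(past("jy", "begin", has_unstressed_prefix=True))  # jy het begin
print(past("ek", "kan"))                                # ek kon
```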


2.2 IPP 


In Afrikaans, IPP mainly occurs in the double infinitive construction (2.2.1) and in 
pseudo-coordination (2.2.2). A special case is IPP in passive constructions (2.2.3). 


2.2.1 Double infinitive construction 


As pointed out in the introduction, the perfect auxiliary canonically combines 
with a past participle, as in (3) and (10), but if the participle selects an infinitival 
complement, it may take the IPP form instead, as in (9) and (11). 


(10) Hy het stil gebly/*bly. 
he has quiet remain.PSP/remain.INF 
‘He remained quiet.’ 


5 Mog ‘might’ is archaic and does not appear anymore in contemporary Afrikaans. The verb hê
‘have’ used to have a preterite form (had), but this is now replaced by het gehad. The modals hoef 
‘need to’ and behoort ‘ought to’ do not have a preterite form. Their past counterparts are formed 
with the auxiliary het. 

6 For the homophonous auxiliary the infinitive is identical to the simple present, i.e. het. 

(11) Hy het gebly/bly praat.
he has remain.PSP/remain.INF talk.INF 
‘He kept talking.’ 


Ponelis (1979), De Vos (2001) and Zwart (2007) claim that the choice for the IPP 
form is optional, but Donaldson observes that it is often preferred in the contexts 
where it is allowed and that the use of the past participle in (11) is typical of
colloquial Afrikaans (Donaldson 1993: 225f.). If the IPP form is chosen, one gets a
sequence of two infinitives, hence the term double infinitive construction. 

A list of verbs that may take the IPP form is provided in Ponelis (1979). It 
includes the aspectual verbs begin ‘begin’, bly ‘stay’, aanhou ‘continue’, ophou 
‘stop’, gaan ‘go’, kom ‘come’ and loop ‘walk’, the causative laat ‘let’, and the 
control verbs help ‘help’, leer ‘learn, teach’ and probeer ‘try’. They all combine 
with a bare infinitive, except for begin, which can also have an infinitive intro- 
duced by te ‘to’. 

Van Schoor (1983) contains a longer list, which also includes kry ‘get’, durf 
‘dare’, the causative maak ‘make’, the evidential modals skyn ‘seem’ and blyk 
‘turn out’, and the perception verbs hoor ‘hear’, sien ‘see’, voel ‘feel’ and ruik 
‘smell’. According to Van Schoor, IPP is obligatory for all of these verbs, except for 
aanhou and ophou. 


Table 1: IPP versus PSP (Robbers 1997) 


                        IPP             PSP

aanhou ‘continue’       0 (0%)          0 (0%)
begin ‘begin’           0 (0%)          0 (0%)
bly ‘remain’            17 (94.44%)     1 (5.56%)
gaan ‘go’               71 (100%)       0 (0%)
help ‘help’             1 (33.33%)      2 (66.67%)
hoor ‘hear’             2 (66.67%)      1 (33.33%)
laat ‘let’              49 (98%)        1 (2%)
leer ‘learn/teach’      5 (100%)        0 (0%)
sien ‘see’              16 (94.12%)     1 (5.88%)
voel ‘feel’             1 (100%)        0 (0%)

Total                   162 (96.43%)    6 (3.57%)

Robbers (1997) is the first to provide quantitative data. For a subset of the IPP
verbs she counted the frequency of IPP and non-IPP (i.e. PSP) occurrences in a 
128K-word corpus, consisting of (parts of) three novels.⁷ The results are presented
in Table 1. 

This survey clearly shows that IPP is preferred when it is possible, confirming 
the claim in Donaldson (1993). The only exception is help, but notice that there 
are only three occurrences of this verb in the data set. 
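
The percentages in Table 1 follow directly from the raw counts, as the sketch below illustrates. The data structure and the function name are illustrative; returning `None` for verbs without any attestation (rather than the 0% printed in the table) is a choice made for this sketch, not part of Robbers’ presentation.

```python
# (IPP, PSP) counts per verb as reported in Table 1 (Robbers 1997).
counts = {
    "aanhou": (0, 0), "begin": (0, 0), "bly": (17, 1), "gaan": (71, 0),
    "help": (1, 2), "hoor": (2, 1), "laat": (49, 1), "leer": (5, 0),
    "sien": (16, 1), "voel": (1, 0),
}


def ipp_share(ipp: int, psp: int):
    """Percentage of IPP forms, or None when the verb is unattested."""
    total = ipp + psp
    return None if total == 0 else 100 * ipp / total


shares = {verb: ipp_share(ipp, psp) for verb, (ipp, psp) in counts.items()}
total_ipp = sum(ipp for ipp, _ in counts.values())
total_psp = sum(psp for _, psp in counts.values())
overall = ipp_share(total_ipp, total_psp)

for verb, share in shares.items():
    label = "no attestations" if share is None else f"{share:.2f}% IPP"
    print(f"{verb:8s} {label}")
print(f"overall  {overall:.2f}% IPP ({total_ipp} of {total_ipp + total_psp})")
```

Listing the verbs in this way also makes visible how small some of the samples are: the apparent exception help rests on three tokens only.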

Note that Conradie (2006) claims there is no such thing as IPP in Afrikaans. 


(12) Hy het ’n liedjie laat/gelaat sing.
he has a song let.INF/let.PSP sing.INF
‘He made them sing a song.’


In a construction like (12), he analyses laat sing as a single verb, and considers 
(ge)laat sing as a single past participle with optional ge- prefixation. He does not 
give any arguments for this analysis, though. According to Ponelis (1993), also cited in 
Robbers (1997), a small number of these combinations of a linking verb and a main 
verb have become lexicalized, e.g. gaan haal ‘fetch’, laat blyk ‘indicate’, laat geld 
‘exercise authority’, laat kom ‘summon’, laat spaander ‘get going’ and laat staan 
‘leave’. Most combinations, however, are not lexicalized as the verb clustering is a 
productive process and there is no specialization of meaning. The present tense 
version is “Hy laat ’n liedjie sing”, making it hard to claim laat sing is a single verb. 


2.2.2 Pseudo-coordination 


Another construction in which IPP forms occur concerns a serialization pattern 
with the conjunction en ‘and’. It is used to express the continuous or progressive 
aspect, as in (13). 


(13) Ons staan en luister. 
we stand.PRES and listen.INF 
‘We are listening.’ 


Since the conjunction does not have its usual coordinating function here, the 
construction is known as pseudo-coordination. It also exists in English (e.g. he sits 


7 Robbers’ corpus study is based on novels of Van Heerden (1987, first 100 pages), Van Niekerk 
(1994, first 100 pages) and Brink (1995, first 160 pages), but she does not mention the titles herself. 
and reads), but not in Dutch or German. If staan is combined with the perfect 
auxiliary, it optionally takes the IPP form. The verb after en, by contrast, must 
take the infinitival form, as shown in (14). 


(14) Ons het gestaan/staan en luister/*geluister. 
we have stand.PSP/stand.INF and listen.INF/listen.PSP 
‘We were listening.’ 


Both options are considered standard Afrikaans by Ponelis (1979: 241-245), Donald- 
son (1993), Zwart (2007), and Verdoolaege/Van Keymeulen (2010). Robbers (1997) 
points out that en may be omitted, as in (15-16). 


(15) Sy staan ril effe. 
she stands tremble.INF momentarily 
‘She is trembling for a minute.’ 


(16) Sy het gestaan/staan ril effe. 
she has stand.PSP/stand.INF tremble momentarily 
‘She was trembling for a minute.’ 


Notice that the variant with the IPP form in (16) has the same form as the double 
infinitive construction. 

Besides staan ‘stand’, the verbs which occur in this construction are lê ‘lie’, 
loop ‘walk’, and sit ‘sit’.⁸ Notice that this use of loop is different from the one that 
it has in the double infinitive construction. Robbers (1997) provides some 
quantitative data, see Table 2. 


Table 2: IPP versus PSP in pseudo-coordination (Robbers 1997) 


IPP PSP 
lê ‘lie’ 2 (50%) 2 (50%) 
loop ‘walk’ 0 (0%) 2 (100%) 
sit ‘sit’ 2 (16.67%) 10 (83.33%) 
staan ‘stand’ 3 (15%) 17 (85%) 
Total 7 (22.58%) 31 (77.42%) 


8 Ponelis (1979) points out that the infinitival marker te ‘to’ appears instead of en in a very limited 
number of cases. Such constructions are archaic. 

In contrast to what we observed for the double infinitive construction, we see a 
preference for the non-IPP combination. 


2.2.3 Passive IPP constructions 


De Vos (2001) reports that some of the IPP verbs, esp. laat ‘let’, tend to passivize 
fairly productively. Examples are (17-18). 


(17) Hierdie huis is deur my oom laat/gelaat bou. 
this house is by my uncle let.INF/let.PSP build 
‘My uncle had this house built.’ 


(18) ’n stuk hout wat daarheen laat sak word 
a piece wood that to-there let.INF lower.INF is 
‘a piece of wood that is lowered to that place’ 


Robbers (1997: 61-64) also provides examples with kom ‘come’, ophou ‘stop’ and 
begin ‘begin’. If the embedded infinitive is a transitive verb, such as bou ‘build’, 
its object is identified with the subject of the passive auxiliary, as in (17). If the 
embedded infinitive is intransitive, it is its subject which is identified with the 
subject of the passive auxiliary, as in (18).⁹ 

According to Ponelis (1979), complex passives are not acceptable in pseudo- 
coordination. De Vos (2001) also points out that, although speaker judgements 
vary, it is difficult to passivize constructions with pseudo-coordination. Robbers 
(1997), however, had two informants who accept the construction, and Breed 
(2012) cites a few examples from the internet, such as (19). 


(19) ? Die appel word deur hom gesit en eet. 
the apple is by him sit.PSP and eat.INF 
‘He sits and eats the apple’ 


Notice that sit has the PSP form. Breed’s examples all contain transitive verbs, 
such as eet ‘eat’. 


9 Constructions like (17) are impossible in Dutch and German. However, some speakers of Ger- 
man allow the remote passive construction, e.g. Weil der Wagen oft zu reparieren versucht wurde. 
‘Because many attempts were made to repair the car.’ (Müller 2002: 136). In those constructions, 
the object of the embedded verb is realized as the subject of the selecting verb (in this case ver- 
suchen ‘try’). In German, these constructions do not show the IPP effect. 

2.2.4 Summing up 


The literature on Afrikaans discusses a limited set of verbs that figure in IPP con- 
structions. They concern the double infinitive construction, pseudo-coordination 
and their passive counterparts. The literature is inconclusive about the issue of 
which verbs belong to the set (each author has his/her own set) and whether IPP 
is optional or obligatory. With a more extensive corpus study we aim to shed some 
more light on this issue, see section 3. 


2.3 Modal verbs 


The modal verbs are special in two respects. First, they are, together with the 
copula, the only verbs with a morphologically marked past tense, see 2.1. Second, 
they lack a past participle form. It follows that the past tense counterpart 
of a sentence with a modal verb is formed differently from that of the 
other verbs. As an example, let us take the sentence Jan kan hard werk ‘John can 
work hard’. According to Robbers (1997) it has no fewer than five possible past 
tense counterparts. The most common one is (20), in which the modal has the 
preterite form.¹⁰ 


(20) Jan kon hard werk. [pret₁ – inf₂] 


The alternatives all contain the perfect auxiliary het.¹¹ 


(21) Jan het hard kon werk [aux₁ – pret₂ – inf₃] 
(22) Jan het hard kan werk [aux₁ – pres₂ – inf₃] 
(23) Jan kon hard gewerk het [pret₁ – psp₃ – aux₂] 
(24) Jan kan hard gewerk het [pres₁ – psp₃ – aux₂] 


In (21) the auxiliary is combined with the preterite form kon, which in turn selects 
the infinitive werk. Robbers calls it archaic. In (22) the auxiliary selects the present 


10 In the added annotation aux stands for the auxiliary het ‘have’, pres for the present tense of 
the modal, pret for the preterite form of the modal, inf for the infinitival form of the main verb, 
and psp for the participial form of the main verb. The subscripts indicate the order of selection, 
with the hierarchically highest verb being 1. 

11 Ponelis (1979) calls the construction in (21) preteritive assimilation and the one in (23) preteri- 
tive replacement. 
form kan which selects werk. Robbers doubts whether it is well-formed. In (23) 
and (24) the main verb takes the form of the past participle which is selected by 
the auxiliary het; the modal appears in its preterite form in (23) and in its simple 
present form in (24). Ponelis (1993: 412) suggests that the preterite form of the 
modal in (21) crossed over from finite to infinitival territory. 

We find the same possibilities in subordinate clauses, but with another word 
order. 


(25) ... dat Jan hard kon werk [pret₁ – inf₂] 
(26) ... dat Jan hard kon werk het [pret₂ – inf₃ – aux₁] 
(27) ... dat Jan hard kan werk het [pres₂ – inf₃ – aux₁] 
(28) ... dat Jan hard kon gewerk het [pret₁ – psp₃ – aux₂] 
(29) ... dat Jan hard kan gewerk het [pres₁ – psp₃ – aux₂] 


For verbs that have identical forms for the infinitive and the past participle, such 
as gebeur ‘happen’ and verkies ‘prefer’, the [pret₂ – inf₃ – aux₁] sequence in (26) is 
not distinct from the [pret₁ – psp₃ – aux₂] sequence in (28), and the [pres₂ – inf₃ – aux₁] 
sequence in (27) is not distinct from the [pres₁ – psp₃ – aux₂] sequence in (29). 

An obvious question is whether there are any semantic differences between 
the five alternatives. Stell (2011: 162-165) points out that the patterns in (21) and 
(22) are only used with declarative (indicative) senses, whereas the canonical 
pattern in (20) and the patterns with the past participle in (23) and (24) are also 
used with conditional and hypothetical senses, including the irrealis. 


3 Corpus study 


For the corpus study we have used two corpora: the Taalkommissie corpus and 
the Afrikaans section of Wikipedia. 

The Taalkommissie corpus is a 54 million-word corpus of written Afrikaans, 
developed by the Centre for Text Technology (CTexT 2011) of North-West Univer- 
sity’s Potchefstroom campus under the auspices of the language committee of 
the Suid-Afrikaanse Akademie vir Wetenskap en Kuns (South African Academy of 
Science and Arts). It covers a wide selection of genres including newspapers and 
magazine articles, textbooks, law and government texts, Bible texts, and litera- 
ture excerpts. Unfortunately, the version of the corpus at our disposal does not 
contain any metadata, so it is not possible to track the origin of the sentences or 
to compare the different registers. The corpus was automatically tokenized, anno- 
tated with part-of-speech (PoS) tags and integrated in a search tool as described 
in Augustinus/Dirix (2013). The tag set consists of 139 different tags, mainly based 
on morphosyntactic features (Pilon 2005). The author of the tagger claims an 
accuracy of 85.87% on a small data set, which is rather low compared to state- 
of-the-art PoS taggers for well-resourced languages. Schlünz (2010) reports an 
accuracy of 94.64%, but with a tag set reduced to only 17 different tags, mainly 
just part-of-speech. We decided to use the extended tag set, as the tags for the 
different verb forms are important to us. 

Wikipedia¹² is the world’s best-known online encyclopaedia, available in about 
280 languages. Its Afrikaans section contained 40,820 articles with 13.2 million 
words at the moment of download (July 2016). After extracting the articles with 
WikiExtractor.py,¹³ we applied the same tokenization and tagging procedure as we 
did for the Taalkommissie corpus. 

The following sections describe the results of the corpus investigation of both 
corpora. Section 3.1 presents the double infinitive constructions, section 3.2 pseudo- 
coordination, section 3.3 passive constructions, and section 3.4 constructions with 
modal verbs. Section 3.5 compares the results for the two corpora. Section 3.6 
concludes. 


3.1 Double infinitive constructions 


In order to collect double infinitive constructions and the corresponding 
constructions with a past participle, we searched for the following constructions in 
the Taalkommissie and Wikipedia corpora:¹⁴ 

— Constructions in which the auxiliary het ‘have’ is immediately followed by 
two verbs (main clauses), 

— Constructions in which the auxiliary het ‘have’ is immediately preceded by 
two verbs (subordinate clauses), 

— Constructions in which there is another word between het ‘have’ and the two 
other verbs (main clauses). 


12 www.wikipedia.org (last accessed: 12-6-2016). 

13 Downloadable from https://github.com/attardi/wikiextractor (last accessed: 12-6-2016). 

14 In Augustinus/Dirix (2013) a more limited corpus study of IPP in the Taalkommissie corpus 
was done. Constructions with modal verbs and verbs selecting a te infinitive were not considered 
in that study. The methodology for extracting double infinitive constructions and constructions 
with pseudo-coordination is similar to the methodology used for this study. 

It is obviously possible that more than one word occurs between het and the two 
other verbs, see example (9) in the introduction, but we limited our research to 
constructions with no more than one word between het and the verbal group. 
Searching for constructions with more words in between resulted in data that 
were too noisy to classify manually.¹⁵ 
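
The three search patterns, together with the one-word limit, can be sketched as a scan over PoS-tagged sentences. The Python sketch below is illustrative only: it is not the search tool described in Augustinus/Dirix (2013), the names find_candidates, is_verb and is_het are hypothetical, and it assumes that every verbal tag starts with V, which simplifies the actual 139-tag set.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word form, PoS tag); tag names are hypothetical


def is_verb(tok: Token) -> bool:
    # Assumption: every verbal tag starts with "V".
    return tok[1].startswith("V")


def is_het(tok: Token) -> bool:
    return tok[0].lower() == "het" and is_verb(tok)


def find_candidates(sent: List[Token], max_gap: int = 1) -> List[Tuple[str, int]]:
    """Return (pattern label, index of het) for each candidate verb cluster."""
    hits = []
    for i, tok in enumerate(sent):
        if not is_het(tok):
            continue
        # Main clause: het, at most max_gap non-verbal words, two adjacent verbs.
        for gap in range(max_gap + 1):
            j = i + 1 + gap
            if j + 1 >= len(sent):
                break
            gap_is_clean = not any(is_verb(t) for t in sent[i + 1:j])
            if gap_is_clean and is_verb(sent[j]) and is_verb(sent[j + 1]):
                hits.append((f"main, gap={gap}", i))
                break
        # Subordinate clause: two adjacent verbs immediately before het.
        if i >= 2 and is_verb(sent[i - 1]) and is_verb(sent[i - 2]):
            hits.append(("subordinate", i))
    return hits


# The sentence "Hy het gaan draf", with made-up tags
print(find_candidates([("Hy", "PRON"), ("het", "VAUX"),
                       ("gaan", "VINF"), ("draf", "VINF")]))
# prints [('main, gap=0', 1)]
```

A scan of this kind over-generates, which is in line with the need for manual checking of the matches; raising max_gap would widen the search at the cost of more noise.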

The corpus searches resulted in more than 13,000 matches, which were man- 
ually checked and categorized. After filtering out the false positives, we retained 
7,713 hits, of which 7,632 contain an IPP verb. 

The results of the corpus investigation are presented in Table 3. The first 
column describes the verb types. The categorization is taken from Augustinus/ 
Van Eynde (2017).¹⁶ The classification is based on the distinction between subject- 
and object-oriented VP selectors. In IPP constructions with a subject-oriented verb, 
the unexpressed subject of the infinitival complement is identified with the sub- 
ject of the IPP, whereas in IPP constructions with an object-oriented verb the 
unexpressed subject is identified with the (in)direct object of the IPP. An example 
of a subject-oriented verb is try in He tried to cheat on us, whereas made in She 
made him smile is an instance of an object-oriented verb. The subject-oriented 
verbs are further divided into aspectual, evidential and subject-control verbs, 
while the object-oriented verbs are partitioned into causative, perception and 
benefactive verbs. 

The second column of Table 3 gives the lemma of each verb. The third and 
fourth columns contain the results from the Taalkommissie corpus and 
Wikipedia respectively. The fifth column contains the row totals, and the sixth 
column gives the percentage of hits that contain an IPP construction for the 
verb under investigation. Within the results for each corpus (third and fourth 
columns), the first subcolumn gives the number of times the verb occurs as an IPP 
verb selecting a bare infinitive (IPP), the second how often it occurs as an IPP verb 
selecting a te infinitive (IPP + te), and the third how often it occurs in a 
construction with a past participle (PSP). 


15 Using a treebank would solve this problem, but unfortunately the only treebank at our dis- 
posal, AfriBooms (Augustinus et al. 2016; Dirix et al. 2017), turned out to be too small to inves- 
tigate the IPP phenomenon. Therefore, we decided to use flat corpora, but with a restriction on 
the length of the constructions under investigation. 

16 In section 4 we will compare the results of this study to the typology of IPP verbs proposed in 
Augustinus/Van Eynde (2017). 


IPP in Afrikaans: a corpus-based investigation — 121 


Table 3: Double infinitive constructions and corresponding constructions with a past participle 
in the Taalkommissie corpus and Wikipedia 


Type               Verb                  Taalkommissie corpus       Wikipedia                  Total    % IPP 
                                         IPP      IPP + te  PSP     IPP      IPP + te  PSP 
subject-oriented 
aspectual          begin ‘begin’         1,461    1         1       713      1         1       2,178    99.91 
                   gaan ‘go’             853      0         0       200      0         0       1,053    100.00 
                   kom ‘come’            645      0         17      147      0         0       809      97.90 
                   bly ‘continue’        270      0         1       87       0         0       358      99.72 
                   aanhou ‘continue’     45       0         6       24       0         3       78       88.46 
                   ophou ‘stop’          16       0         6       10       0         0       32       81.25 
subject control    probeer ‘try’         575      0         1       172      0         0       748      99.87 
                   durf ‘dare’           35       0         1       4        0         0       40       97.50 
                   leer ‘learn’          24       0         6       3        0         5       38       71.05 
                   weet ‘manage’         0        3         0       0        0         0       3        100.00 
evidential         blyk ‘turn out’       0        3         4       0        3         2       12       50.00 
object-oriented 
causative          laat ‘let’            1,502    0         2       507      0         0       2,011    99.90 
                   maak ‘make’           1        0         5       1        0         0       7        28.57 
perception verbs   sien ‘see’            130      0         7       9        0         1       147      94.56 
                   hoor ‘hear’           4        0         0       0        0         0       4        100.00 
benefactive verbs  help ‘help’           111      0         8       70       0         3       192      94.27 
                   leer ‘teach’          2        0         1       0        0         0       3        66.67 
Total                                    5,674    7         66      1,947    4         15      7,713    98.95 


The results in Table 3 show that there are very few examples without IPP: 7,632 
constructions or 98.95% of the matches contain an IPP construction. For 11 of the 
17 verbs the percentage of IPP forms is higher than 90% and for three of those it 
is even 100%. Examples and more detailed discussion are given in 3.1.1 for the 
subject-oriented and in 3.1.2 for the object-oriented VP selectors. 
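
The % IPP column can be recomputed from the raw counts: the IPP and IPP + te hits of both corpora are divided by the row total. A minimal Python sketch, using four rows copied from Table 3; the function name pct_ipp is introduced here for illustration only and is not part of the study's tooling.

```python
def pct_ipp(tk, wiki):
    """tk and wiki are (IPP, IPP + te, PSP) counts for one verb in each corpus."""
    ipp = tk[0] + tk[1] + wiki[0] + wiki[1]
    total = sum(tk) + sum(wiki)
    return round(100 * ipp / total, 2)


# Counts taken from Table 3: (Taalkommissie corpus, Wikipedia)
rows = {
    "begin": ((1461, 1, 1), (713, 1, 1)),
    "leer 'learn'": ((24, 0, 6), (3, 0, 5)),
    "blyk": ((0, 3, 4), (0, 3, 2)),
    "Total": ((5674, 7, 66), (1947, 4, 15)),
}

for verb, (tk, wiki) in rows.items():
    print(verb, pct_ipp(tk, wiki))
# begin 99.91
# leer 'learn' 71.05
# blyk 50.0
# Total 98.95
```

The Total row reproduces the figure quoted in the text: 7,632 of the 7,713 retained hits, or 98.95%.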

We also found 84 clusters with three verbs and 4 clusters with four verbs. 
All of these have IPP,¹⁷ as in example (30): 


17 This is also true in Dutch for clusters with three verbs or more. 

(30) Sy sou graag meermale wou kom speel het, 
she would happily more-times want.PRET come.INF play.INF have.INF 
maar die opsigter gee haar die kreeps. 
but the supervisor gives her the creeps 
‘She would have come over to play more often, but the supervisor gives her 
the creeps.’ 


3.1.1 Subject-oriented VP selectors 


The set of aspectual verbs is the largest category, both with respect to verb types 
(6) and tokens (4,508). They add up to 58.45% of the 7,713 constructions under 
investigation. Some examples are given in (31-32).¹⁸ 


(31) Hy het gaan draf. 
he has go.IPP run 
‘He went running.’ 


(32) Sy het begin huil en het my uitgesit omdat ek dronk was. 
she has begin.IPP cry and has me out-thrown because I drunk was 
‘She started crying and threw me out because I was drunk’ 


The verbs begin ‘begin’, gaan ‘go’, kom ‘come’, and bly ‘continue’ appear as IPP 
in more than 97% of the cases. The verbs aanhou ‘continue’ and ophou ‘stop’ have 
a slightly higher percentage of constructions with a past participle, cf. (33-34). 


(33) Ek het vir hom gesê die kluis staan oop en dis leeg, maar hy 
I have for him said the safe stands open and this-is empty, but he 
het aangehou vra. 
has keep.PSP ask.INF 
‘I said to him the safe was open and empty, but he kept asking.’ 


(34) Hy het opgehou oefen na sy motorongeluk 20 jaar gelede. 
he has stop.PSP practise after his car-accident 20 years ago 
‘He stopped practicing 20 years ago after his car accident.’ 


18 All examples in this section are taken from the Taalkommissie corpus. We consider the use of 
begin in (32) to be an IPP, rather than a PSP. 

Note that those verbs are the only verbs with a separable particle encountered in 
the data set. 

The figures in the PSP column of the Taalkommissie corpus include several 
combinations with two participles. For kom ‘come’ this is the case for 12 out of the 
17 PSP constructions, cf. example (35). 


(35) ’n Vragmotor wat in die teenoorgestelde rigting aangery 
a truck which in the opposite direction drive.PSP 
gekom het... 
come.PSP has 
‘A truck which came from the opposite direction ...’ 


The literature does not mention this construction.¹⁹ Our consultants consider such 
examples ill-formed or very colloquial.²⁰ This might explain why we did not encounter 
this construction in the Wikipedia corpus. 

The subject control verbs account for 10.75% of the constructions under 
investigation. An example is given in (36). 


(36) Hy het probeer terugbaklei, maar is in ’n stoeiery in die bors 
he has try.INF fight-back.INF but is in a struggle in the chest 
geskiet ... 
shot 
‘He tried to fight back, but was shot in the chest during a struggle ...’ 


In contrast to Dutch proberen ‘try’, which is an optional IPP verb, the Afrikaans 
probeer has a high preference for IPP; we have encountered only one construction 
with a past participle and it clearly contains colloquial language (37). 


(37) oom Gert Wiese het geprobeer paai en gesê: “Broer, laat ons 
uncle Gert Wiese has try.PSP appease.INF and say.PSP brother let us 
tog maar’ie vrede bewaar en da moet ons ôk 
PRT but-the.COLL peace keep and then.COLL must we also.COLL 
onthou ons is nog altyd innie voorhowe vannie 
remember we are still always in-the.COLL forecourts of-the.COLL 


19 We also encountered one instance of a double participle construction for the verb bly ‘continue’ 
and two for the verb sien ‘see’ in the Taalkommissie corpus. 
20 We have asked five native speakers whether they could use such constructions. 

tempel.” 
temple 
‘Uncle Gert Wiese tried to appease him and said: “Brother, let us keep the 
peace and we must also keep in mind that we are still in the forecourts of 
the temple.”’ 


As with the aspectual verbs, IPP is the preferred construction for Afrikaans subject 
control verbs. Only in the case of leer ‘learn’ have we encountered a substantial 
percentage (nearly 30%) of constructions with a past participle, as in (38). 


(38) Sy het geleer fantaseer, en aanhou fantaseer, deur haar 
she has learn.PSP fantasize.INF, and keep.INF fantasize.INF through her 
skooljare, deur haar studentedae ... 
school-years through her student-days 
‘She learned to fantasize and kept fantasizing, during her years at school, 
during her days at college ...’ 


The examples with geleer are definitely not colloquial, so leer can be considered 
a truly optional IPP verb in Afrikaans. In Wikipedia, the use of the past participle 
is the preferred option, but note that the overall frequency of the verb is too small 
to draw hard conclusions on this matter. 


The verb weet ‘know’ is one of the few verbs that select a te infinitive in IPP 
constructions, cf. example (39). 


(39) Dis sy ma, het die man weet te vertel, wat hom so 
this-is his mother, has the man know.INF to tell.INF, which him so 
genadeloos gedruk het toe hy nog jonk was. 
relentlessly pressured has when he still young was 
‘The man managed to say that it is his mother who pressured him relentlessly 
when he was still young.’ 


Another verb that exclusively selects a te infinitive is the evidential verb blyk 
‘turn out’, as used in (40). 


(40) Dit het heel doeltreffend blyk te wees en die masjien het die 
this has very efficient seem.INF to be.INF and the machine has the 
eerste in Groot-Brittanje (deur Reynolds) vrygestel en het groot treë 
first in Great-Britain (by Reynolds) released and has great steps 
vorentoe in die fietstegnologie getoon. 
forward in the bicycle-technology showed 
‘It seemed very efficient and the machine released the first one in Great 
Britain (by Reynolds) and showed a big step forward in bicycle technology.’ 


Blyk is the only verb in this category. It occurs both as IPP and past participle, so 
IPP is clearly optional for this verb, but note that the overall frequencies of blyk 
are low in both corpora (0.16% of the constructions under investigation). As an 
IPP verb it only appears in the construction blyk te wees ‘appear to be’. Other 
verbs that would fit into this category are lyk ‘seem’ and skyn ‘appear’. Those verbs 
appear in the corpus, but not in combination with het.²¹ 


3.1.2 Object-oriented VP selectors 


Within the set of object-oriented VP selectors the causatives form the largest 
category (2,018 hits, 26.16%). Most of the constructions are instances of laat ‘let’ 
(41), but we also found a few examples with maak ‘make’ in the sense of ‘make 
someone do something’, as in (42). 


(41) Na ’n dol dag op die filmstel, met ’n regisseur wat die een toneel na 
after a crazy day on the film-set, with a director which the one scene after 
die ander tot vervelens toe laat herhaal het... 
the other until boring to let.INF repeat.INF has 
‘After a crazy day on the film set, with a director who made us repeat one 
scene after another ad nauseam ...’ 


(42) Haal die Haaie die eindstryd, sou dit wees omdat hulle tereg 
reach the Sharks the finals, would this be because they rightly 
hul veldtog op die basiese hoekstene van die spel maak 
their campaign on the basic cornerstones of the game make.INF 
staan het. 
stand.INF have 
‘If the Sharks reach the finals, it would be because they rightly based their 
campaign on the fundamentals of the game’ 


21 The Dutch cognates blijken ‘turn out’, schijnen ‘appear’, and lijken ‘seem’ show similar proper- 
ties: constructions with IPP are hard to find as those verbs are rarely combined with the auxiliary 
of the perfect (Broekhuis/Corver 2015: 621). 

Maak ‘make’ is the only verb in Table 3 for which the IPP hits are outnumbered 
by the PSP hits (only 28.57% IPP). Notice that its Dutch cognate maken is not used 
as an IPP verb at all. 

The perception verbs hoor ‘hear’ and sien ‘see’ account for a meagre 1.96% 
of the constructions. Examples are given in (43) and (44). 


(43) ... sodat sy hulle nooit hoor terugkom het nie. 
... so-that she them never hear.INF return.INF has not 
‘... so she never heard them return’ 


(44) omdat sy hom nog nooit so sien glimlag het nie. 
because she him yet never so see.INF smile.INF has not 
‘because she had never seen him smile that way’ 


Both verbs clearly prefer IPP. For hoor we did not find any instances of PSP, and 
for sien just a few, including (45). 


(45) ’n ADT-veiligheidswag wat die voorval gesien gebeur het, het sy 
a ADT-security-guard which the event see.PSP happen.INF has, has his 
noodknoppie gedruk om bystand te ontbied. 
emergency-button pressed for assistance to send-for 
‘An ADT security guard who witnessed the incident, pushed the emer- 
gency button to call assistance.’ 


The set of benefactives concludes the typology. The 195 hits account for 2.53% 
of the constructions. Its members leer ‘teach’ and help ‘help’ both occur as IPP 
constructions and with a participle: 


(46) Emily Petlo, ’n buurvrou wat Klyn help skoonmaak het, sê die 
Emily Petlo, a neighbour which Klyn help.INF clean.INF has, says the 
water in haar toiletbak staan ook hoog. 
water in her toilet stands also high 
‘Emily Petlo, a neighbour who helped Klyn to clean, said the water in her 
toilet was also high.’ 


(47) Net soos my pa my geleer werk het, so leer ek my kinders. 
just like my father me teach.PSP work.INF has, so teach I my children 
‘Just as my father taught me how to work, I am teaching my children.’ 

3.1.3 Summing up 


The results in Table 3 show that the aspectual verbs are the prototypical IPP verbs 
in Afrikaans, followed by causative laat ‘let’ and the set of subject control verbs. 
Evidential blyk ‘turn out’, causative maak ‘make’ and benefactive leer ‘teach’ occur 
in less than 70% of the hits in the IPP form, but note that the absolute frequencies 
for those verbs are very low. In general, the double infinitive construction with 
IPP is by far the preferred construction compared to constructions in which a past 
participle is selected. 

If we compare the verbs in Table 3 to the verbs mentioned in section 2, we see 
that the results from the corpus study include all verbs mentioned in Ponelis 
(1979), except for loop ‘walk’. Van Schoor (1983) also mentions the verbs kry ‘get’, 
voel ‘feel’, and ruik ‘smell’ as IPP verbs, but they were not retrieved in the dataset. 
Robbers (1997) encountered one instance of voel ‘feel’ in her corpus study. The 
non-occurrence of voel and ruik in the Taalkommissie and Wikipedia corpora may 
be due to data sparseness, but for kry this is unlikely. Kry consistently occurs as a 
past participle in combination with a verbal complement. Weet ‘manage’ is the 
only verb encountered in the dataset that is not mentioned in the literature. 


3.2 Pseudo-coordination 


A second corpus investigation concerns IPP constructions with pseudo-coordination. 
We searched for constructions similar to those in section 3.1, but with the 
conjunction en ‘and’ between the two verbs that precede or follow the auxiliary het ‘have’. 
This resulted in more than 1,900 hits, but after filtering out the false positives, 
only 248 examples were retained. Table 4 presents the results. Although the 
literature claims en is optional, we did not find any occurrences without en, either 
with or without IPP.²² 
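
How a hit of this kind is assigned to the IPP or the PSP column can be illustrated with a small Python sketch. It is a simplification under stated assumptions: it works on plain word forms instead of PoS tags, it only covers the four verbs of Table 4, it requires het to be directly adjacent to the verb pair, and the name classify is hypothetical.

```python
# Infinitive -> past participle for the four verbs in Table 4
POSTURE = {"sit": "gesit", "staan": "gestaan", "lê": "gelê", "loop": "geloop"}
PSP_TO_LEMMA = {psp: inf for inf, psp in POSTURE.items()}


def classify(words):
    """Return (lemma, 'IPP' or 'PSP') for each 'V1 en V2' next to het."""
    out = []
    for i in range(len(words) - 2):
        v1 = words[i].lower()
        if words[i + 1].lower() != "en":
            continue
        # het directly precedes V1 (main clause) or directly follows V2
        # (subordinate clause)
        before = i > 0 and words[i - 1].lower() == "het"
        after = i + 3 < len(words) and words[i + 3].lower() == "het"
        if not (before or after):
            continue
        if v1 in POSTURE:
            out.append((v1, "IPP"))
        elif v1 in PSP_TO_LEMMA:
            out.append((PSP_TO_LEMMA[v1], "PSP"))
    return out


print(classify("En ons almal het gesit en wag op die droogte".split()))
# prints [('sit', 'PSP')]
print(classify("waarin sy lê en lees het".split()))
# prints [('lê', 'IPP')]
```

A form-based filter like this cannot tell pseudo-coordination from genuine coordination of two verbs, which is one reason why manual filtering of the hits remains necessary.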


22 According to Robbers (1997) such constructions are grammatical, see example (16). In the 
Taalkommissie corpus we found a number of examples in present tense constructions, e.g. Die 
ouer span staan praat in groepies ‘The older team was talking in small groups’, but none in 
combination with het. 


128 — Peter Dirix/Liesbeth Augustinus/Frank Van Eynde 


Table 4: IPP in constructions with pseudo-coordination in the Taalkommissie corpus and 
Wikipedia 


Verb             Taalkommissie    Wikipedia      Total   % IPP 
                 IPP     PSP      IPP    PSP 
sit ‘sit’        48      59       2      1       110     45.45 
staan ‘stand’    45      47       3      2       97      49.48 
lê ‘lie’         30      5        0      0       35      85.71 
loop ‘walk’      1       5        0      0       6       16.67 
Total            124     116      5      3       248     52.02 


The counts in Table 4 show that IPP is optional in constructions with pseudo- 
coordination. This is in accordance with the literature on these constructions. 
The verbs occurring in constructions with pseudo-coordination are all aspectual 
verbs which refer to posture (sit, staan and lê) or movement (loop).²³ The verb lê 
‘lie’ clearly prefers IPP (about 85% of the constructions), but for sit ‘sit’, staan 
‘stand’ and loop ‘walk’ the PSP form is a valid alternative, as less than 50% of the 
constructions appear with IPP. Some examples are given in (48)-(49). 


(48) Sy klap die boek toe waarin sy lê en lees het, en 
she smacked the book closed in-which she lie.INF and read.INF has, and 
draai na Mara toe. 
turned to Mara to 
‘She closed the book which she was reading and turned to Mara.’ 


(49) En ons almal het gesit en wag op die droogte wat voorspel 
and we all have sit.PSP and wait.INF on the drought which predicted 
is ... 
is 
‘And we were all waiting for the drought that was predicted ...’ 


23 Broekhuis/Corver (2015) call this set of verbs semi-aspectuals, in order to differentiate them 
from the core aspectual verbs, which have different characteristics in Dutch. Considering the fact 
that the aspectual verbs mentioned in this section may appear in constructions with pseudo- 
coordination, as opposed to the aspectual verbs mentioned in section 3.1, the distinction makes 
sense for Afrikaans as well (cf. section 4). 




3.3 Passive constructions 


As mentioned in section 2, Afrikaans allows for the passivization of IPP construc- 
tions. We have encountered some examples in the data for 5 out of the 17 verbs 
mentioned in Table 3. Table 5 presents an overview. 


Table 5: Passive IPP constructions in the Taalkommissie corpus and Wikipedia 


Verb             Taalkommissie corpus               Wikipedia                          Total   % IPP 
                 word + IPP   wees + IPP   PSP      word + IPP   wees + IPP   PSP 
laat ‘let’       44           97           0        7            78           0        226     100.00 
begin ‘begin’    82           0            0        14           15           0        111     100.00 
probeer ‘try’    9            3            0        1            1            0        14      100.00 
kom ‘come’       0            1            0        0            0            0        1       100.00 
help ‘help’      0            1            0        0            0            1        2       50.00 
Total            135          102          0        22           94           1        354     99.72 


The figures in Table 5 show that passive IPP constructions are far less frequent 
than their active variants, but we find instances both in the present (with 
auxiliary word), cf. (50), and in the past (with auxiliary wees), cf. (51). 


(50) Indien die voorwerp laat val word, val dit onder die 
if the object let.INF drop.INF is, drop this under the 
invloed van gravitasie vloer toe. 
influence of gravitation floor to 
‘If the object is dropped, it falls to the floor due to gravitation.’ 


(51) ... ’n meganistiese hooftendens waarin alle fisiese verskynsels 
... a mechanistic main-tendency in-which all physical phenomena 
konsekwent probeer herlei is tot ’n kinematiese perspektief. 
consistently try.INF reduce.INF is to a kinematic perspective 
‘... a principal mechanistic tendency in which (physicists) consistently tried 
to reduce all physical phenomena to a kinematic perspective.’ 


In passive constructions, IPP is obligatory for the majority of the verbs. The only 
instance in which help ‘help’ does not have IPP is given in (52). 

(52) die Nowikof-telegram wat deur die Sowjet-ambassadeur aan die VSA 
the Novikov-telegram which by the Soviet-ambassador to the USA 
gestuur is, maar deur Wijatsjeslaf Molotof gelas en gehelp 
sent is but by Vyacheslav Molotov order.PSP and help.PSP 
skryf is ... 
write.INF is 
‘the Novikov telegram, which was sent by the Soviet ambassador to the 
USA, but was ordered and helped to be written by Vyacheslav Molotov ...’ 


Robbers (1997) claims that it is also possible to passivize constructions with ophou 
‘stop’, but we did not find any corpus evidence for that verb. Furthermore, we 
did not encounter any passive constructions with pseudo-coordination in the 
corpus. 


3.4 Constructions with modal verbs 


As explained in section 2.3, Afrikaans modal verbs have a morphologically marked 
past tense. We have searched the Taalkommissie and Wikipedia corpora for past 
constructions with modal verbs, taking into account the alternatives with auxiliary 
het ‘have’ in main and subordinate clauses. We have limited the queries to
constructions with adjacent verbs. Table 6 presents the filtered results for the
Taalkommissie corpus and Wikipedia, in which the canonical pret-inf constructions
(type kon werk) are separated from the alternatives with auxiliary het (e.g. het kon
werk, kan gewerk het). Note that ‘-’ is used for principled non-occurrence and ‘0’
for accidental non-occurrence.


Table 6: Constructions with modal verbs in the Taalkommissie corpus and Wikipedia


Verb                   Taalkommissie           Wikipedia               Total    % canonical
                       Canonical   with het    Canonical   with het

kan ‘can’                 19,912        817        5,325        264   26,318         95.89
sal ‘will’                16,310      1,095        6,542        220   24,167         94.56
moet ‘must’                9,900        647        2,978        101   13,626         94.51
wil ‘want’                 7,164        254          972         21    8,411         96.73
mag ‘may’                      6         48            6         17       77         15.58
hoef ‘need to’                 -         14            -          2       16          0.00
behoort ‘ought to’             -          1            -          0        1          0.00
Total                     53,292      2,876       15,823        625   72,616         95.18


The figures in Table 6 show that more than 95% of the constructions are instances 
of the canonical construction, which is in both the Taalkommissie corpus and 
Wikipedia clearly preferred over the alternatives with auxiliary het. An example 
with wou ‘wanted’ is given in (53). 


(53) wat ek wou vra, volgens watter resep kook jy jou 
what I want.PRET ask.INF according-to which recipe cook you your 
boontjiesop? 
bean-soup 


‘What I wanted to ask, what recipe do you use for your bean soup?’ 


Mag ‘may’ occurs in the canonical preterite construction in only about 16% of the
cases (15.58%). Alternatives with het are more common, cf. (54), but note that the
verb is very uncommon in comparison to the other modals.


(54) Baie leerders besef eers in die eksamenkamer dat, alhoewel hulle baie
     many learners realize first in the exam-room that, although they very
     hard mag studeer het, hulle nie die uitkomste bereik het
     hard may.PRES study.INF have.INF, they not the results reached have
     wat vir die module gestel is nie.
     which for the module put is not
     ‘Despite the fact they might have studied very hard, a lot of learners only
     realize in the exam room they haven’t reached the outcome that was
     expected for the module.’


The verbs hoef ‘need to’ and behoort ‘ought to’ do not have a preterite form and, as 
a consequence, cannot occur in the canonical construction. Their number in the 
corpus results is very small, as those verbs are only used in formal contexts. 
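
The ‘% canonical’ column of Table 6 can be recomputed from the raw counts. The following is a minimal sketch in Python; the variable and function names are illustrative and not part of the original study, and the principled non-occurrences (‘-’) of hoef and behoort are treated as zero canonical hits.

```python
# (canonical, with het) counts per corpus, copied from Table 6.
# Each verb maps to (Taalkommissie, Wikipedia).
# ASSUMPTION: '-' (principled non-occurrence) is encoded as 0.
table6 = {
    "kan":     ((19_912, 817),   (5_325, 264)),
    "sal":     ((16_310, 1_095), (6_542, 220)),
    "moet":    ((9_900, 647),    (2_978, 101)),
    "wil":     ((7_164, 254),    (972, 21)),
    "mag":     ((6, 48),         (6, 17)),
    "hoef":    ((0, 14),         (0, 2)),
    "behoort": ((0, 1),          (0, 0)),
}


def pct_canonical(rows):
    """Share of canonical constructions (in %) over one or more verb rows."""
    rows = list(rows)
    canonical = sum(c for corpora in rows for c, _ in corpora)
    total = sum(c + h for corpora in rows for c, h in corpora)
    return 100 * canonical / total


per_verb = {verb: pct_canonical([counts]) for verb, counts in table6.items()}
overall = pct_canonical(table6.values())

for verb, pct in per_verb.items():
    print(f"{verb:<8} {pct:6.2f}")
print(f"{'Total':<8} {overall:6.2f}")
```

Pooling both corpora per verb, as done here, corresponds to how the ‘Total’ and ‘% canonical’ columns are laid out in Table 6.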

In order to take a closer look at the alternative constructions, the results with
het are split up with respect to the form and the order of the verbs in Table 7. The
ambiguous constructions (‘pres-inf/pp-aux’ and ‘pret-inf/pp-aux’) are put in
separate columns.


Table 7: Constructions with modal verbs and auxiliary het in the Taalkommissie corpus and
Wikipedia


Taalkommissie corpus

                       Preterite modal                                 Present modal
Verb                   pret-psp-aux  pret-inf/psp-aux  aux-pret-inf    pres-psp-aux  pres-inf/psp-aux  aux-pres-inf    Total
                       (kon gewerk   (ambiguous)       (het kon        (kan gewerk   (ambiguous)       (IPP)
                       het)                            werk)           het)

kan ‘can’                   552            249               5              10              1               0            817
sal ‘will’                  794            275              15               8              3               0          1,095
moet ‘must’                 456            119               5              61              6               0            647
wil ‘want’                  228             14               3               4              1               4            254
mag ‘may’                     0              0               0              38              8               2             48
hoef ‘need to’                0              0               0              14              0               0             14
behoort ‘ought to’            0              0               0               1              0               0              1
Total                     2,030            657              28             136             19               6          2,876
%                         70.58          22.84            0.97            4.73           0.66            0.21         100.00


Wikipedia

                       Preterite modal                                 Present modal
Verb                   pret-psp-aux  pret-inf/psp-aux  aux-pret-inf    pres-psp-aux  pres-inf/psp-aux  aux-pres-inf    Total
                       (kon gewerk   (ambiguous)       (het kon        (kan gewerk   (ambiguous)       (IPP)
                       het)                            werk)           het)

kan ‘can’                   184             71               8               0              0               1            264
sal ‘will’                  147             65               4               3              1               0            220
moet ‘must’                  80             15               1               4              1               0            101
wil ‘want’                   19              2               0               0              0               0             21
mag ‘may’                     0              0               0               9              7               1             17
hoef ‘need to’                0              0               0               2              0               0              2
behoort ‘ought to’            0              0               0               0              0               0              0
Total                       430            153              13              18              9               2            625
%                         68.80          24.48            2.08            2.88           1.44            0.32         100.00


Table 7 shows that all constructions mentioned in the literature were encountered,
but their frequency differs markedly. The constructions can be divided into two
main sets: the constructions in which the modal verb appears in its preterite (pret)
form (left-hand side of the table), and the ones in which it appears in its simple
present (pres) tense form (right-hand side of the table). The results show that
constructions belonging to the former set are more common in both corpora.

Around 70% of the alternatives are instances of pret-psp-aux constructions
(type kon gewerk het), which makes it the most common alternative for the
canonical construction. An example is given in (55).


(55) Veel langer sou sy in ieder geval nie hier kon gebly
     much longer would she in each case not here can.PRET stay.PSP
     het nie.
     have.INF not
     ‘She would not have been able to stay here much longer anyway.’


By contrast, the constructions in which the auxiliary het selects a preterite modal
verb (type het kon werk) only account for 0.97% of the constructions in the
Taalkommissie corpus and 2.08% in Wikipedia. The majority of the constructions that
are ambiguous between those constructions (i.e. the pret-inf/psp-aux cases) are,
hence, most likely instances of the former type (pret-psp-aux).²⁴

In comparison to the constructions with a preterite modal, the alternatives with
a present form of the modal verb are uncommon. The pres-psp-aux instances
(type kan gewerk het) account for 4.73% of the constructions in the Taalkommissie
corpus, but they only make up 2.88% of the constructions in Wikipedia. The
least common construction is the one that is most common in past constructions
without modal verbs, i.e. the IPP constructions (aux-pres-inf). They account for
only 0.21% of the constructions in the Taalkommissie corpus and 0.32% in
Wikipedia. An example with mag ‘may’ was given in (50).

As the modals hoef ‘need to’ and behoort ‘ought to’ do not have a preterite
form, we would expect them to occur as IPP verbs, but instead they consistently
occur in pres-pp-aux constructions. An example with hoef is given in (56).


(56) Ek is trots daarop dat ek nog nooit in my lewe op iemand anders
     I am proud there-off that I yet never in my life on someone else
     hoef te gereken het nie.
     need.INF to count.PSP have not
     ‘I am proud of the fact that never in my life have I had to count on anyone
     else.’


24 Some of the ambiguous verbs such as probeer ‘try’, hanteer ‘handle’ and fouteer ‘error’ are
used as past participles without ge- in present-day Afrikaans and mentioned as such in the main
prescriptive spelling dictionary (Van Huyssteen et al. 2017) and also for the first time in the latest
edition of the most commonly used monolingual dictionary (Luther (ed.) 2015), although not
everyone accepts it as grammatical.


Also in the constructions with a present modal there are a number of ambiguous
cases (pres-inf/psp-aux). They most likely belong to the pres-psp-aux category.

Summing up, the corpus results show that the canonical pret-inf construction
is by far the most popular to express the past, followed by the pret-pp-aux
construction. Considering the alternative constructions with auxiliary het, it is
clear that constructions in which the modal verb appears as the finite verb are 
preferred over constructions with an infinitival form of the modal (selected by 
auxiliary het). In combination with the fact that the use of the preterite form of the 
modal is preferred over the present form, IPP ends up as the least preferred option 
to express the past in these cases. 

It would be interesting to investigate the factors that trigger the use of the 
different alternative past constructions, but this is left for future work. 

Our findings confirm Robbers’ (1997) claim that the most common construction 
is the canonical one, and that the instances in which the auxiliary het ‘have’ is the 
finite verb hardly occur. 


3.5 Differences between the two corpora 


If we compare the proportions of the constructions described in sections 3.1 to 3.3,
we see that for most verbs the proportion of IPP to no IPP is similar in both corpora.
However, if we take another look at Table 3, we see that the rate of IPP
constructions is 105.2 occurrences per million words in the Taalkommissie corpus and
147.8 occurrences per million words in Wikipedia. A chi-square goodness of fit test
reveals that there is a statistically significant difference with respect to IPP in the
two corpora.²⁵
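
A goodness of fit test of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual computation: the IPP counts are the double infinitive figures reported in Table 8, and the per-corpus word counts are hypothetical values chosen only so that they add up to the 67,088,350 words reported in Table 9. The helper names are likewise illustrative.

```python
import math


def chi_square_gof(observed, expected):
    """Pearson chi-square goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# IPP hits per corpus (double infinitive row of Table 8).
# ASSUMPTION: which counts entered the original test is not stated.
ipp_hits = {"Taalkommissie": 5_681, "Wikipedia": 1_951}

# ASSUMPTION: hypothetical corpus sizes; only their sum (67,088,350)
# is taken from Table 9, the split is for illustration only.
corpus_words = {"Taalkommissie": 54_000_000, "Wikipedia": 13_088_350}

total_hits = sum(ipp_hits.values())
total_words = sum(corpus_words.values())

# Null hypothesis: IPP hits are distributed in proportion to corpus size.
observed = [ipp_hits[name] for name in ipp_hits]
expected = [total_hits * corpus_words[name] / total_words for name in ipp_hits]

chi2 = chi_square_gof(observed, expected)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, p = {p:.3g}")
```

With two corpora the test has one degree of freedom, which is why the p-value can be obtained from the complementary error function without any statistics library.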

Also on the level of the individual verbs there are some noteworthy differences
between the two corpora. Only the Taalkommissie corpus contains examples with
a participle in the case of kom ‘come’, bly ‘stay’, ophou ‘stop’, probeer ‘try’, durf
‘dare’, laat ‘let’, maak ‘make’, and leer ‘teach’.²⁶


25 The χ² value is 169.606. The p-value is < 0.001. The result is significant at p < 0.01.

With respect to pseudo-coordination we observe that Wikipedia has significantly
fewer constructions with pseudo-coordination than the Taalkommissie
corpus.²⁷

The differences between the corpora are most likely due to the inclusion of 
nonstandard Afrikaans in the Taalkommissie corpus, but as we do not have any 
metadata directly linked to the sentences in the corpus, it is hard to investigate 
this in a systematic way. 


3.6 Conclusion 


The corpus study has revealed some interesting insights with respect to the occur- 
rence of IPP in Afrikaans. With respect to the double infinitive constructions, we 
can conclude that verbs that may appear in such constructions are obligatory IPP 
verbs for most speakers. Within this set, the aspectual verbs turn out to be the 
most common IPP verbs. A subset of the verbs that appear in the double infinitive
construction may also occur as IPPs in passive constructions.

With respect to constructions with pseudo-coordination, the corpus results 
confirm the statement in the literature that IPP is optional. 

Modal verbs have a special status. For those verbs IPP seems the least preferred 
option in comparison to alternative ways to express the past. Instead, a construc- 
tion in which they appear as a preterite finite verb selecting a bare infinitive is 
canonical. 


4 Integrating Afrikaans in the typology of IPP verbs


As already observed in the introduction, Afrikaans shares the IPP phenomenon
with German and Dutch. For a study of the similarities and differences with these
languages we employ the typology of IPP verbs in Augustinus/Van Eynde (2017).
This typology is based on a usage-based analysis of Dutch and German treebanks,
i.e. Lassy Small and CGN core for Dutch and Tüba-D/S and Tüba-D/Z for
German. For Afrikaans the corpus search yielded 8,122 examples with IPP, see
Table 8 for an overview. Comparing this to the data from Dutch and German IPP,
it is immediately clear that the phenomenon is less frequent in Afrikaans, see
Table 9.


26 For each verb in Table 3, a Fisher’s exact test (Fisher 1922) was performed, as it is able to deal
with small sample sizes, but only for leer ‘learn’ was the difference statistically significant (the
Fisher exact test statistic value is 0.0311; the result is significant at p < 0.05).

27 Again, a goodness of fit test was performed. The χ² value is 19.846. The p-value is < 0.001. The
result is significant at p < 0.01.


Table 8: IPPs in the Taalkommissie corpus and Wikipedia


Construction           Taalkommissie corpus    Wikipedia    Total
Double infinitive                     5,681        1,951    7,632
Pseudo-coordination                     124            5      129
Passive                                 237          116      353
Modal                                     6            2        8
Total                                 6,048        2,074    8,122


Table 9: IPP per 10,000 words


              IPPs / words in corpus    IPPs per 10K words
Dutch         1,101 / 1,983,788                       5.55
German          400 / 1,468,415                       2.72
Afrikaans     8,122 / 67,088,350                      1.21


To some extent this is due to the fact that the numbers for Afrikaans are a slight
underestimation given that we only consider adjacent verbs and verbs which are
separated by at most one other word. The main factor, though, is the very low
frequency of IPP forms for the modals. While they are the most frequently used
IPP verbs in the Dutch and German treebanks, the Afrikaans corpora contain just
a few examples, see (54). If the modals are left out of the comparison, Afrikaans
takes an intermediate position, having fewer IPP forms than Dutch, but more
than German.
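
The normalisation behind Table 9, and the effect of leaving out the modals, can be sketched as follows. The counts come from Table 9 and the modal shares from Figure 1 (core modal plus evidential); applying those shares to the overall rate is an approximation made here for illustration, and all names in the code are hypothetical.

```python
def per_10k(hits: int, words: int) -> float:
    """Normalised frequency: occurrences per 10,000 words."""
    return hits / words * 10_000


# (IPP hits, corpus size in words), copied from Table 9.
corpora = {
    "Dutch": (1_101, 1_983_788),
    "German": (400, 1_468_415),
    "Afrikaans": (8_122, 67_088_350),
}

# Share of modal IPPs in percent (core modal + evidential), from Figure 1.
# ASSUMPTION: '-' and '0' in the figure are both treated as 0.
modal_share = {
    "Dutch": 40.14 + 0.0,
    "German": 66.75 + 0.0,
    "Afrikaans": 0.10 + 0.07,
}

rates = {lang: per_10k(hits, words) for lang, (hits, words) in corpora.items()}

# Approximate rate once modal IPPs are removed from the counts.
rates_without_modals = {
    lang: rate * (1 - modal_share[lang] / 100) for lang, rate in rates.items()
}

for lang in corpora:
    print(f"{lang:<10} {rates[lang]:5.2f} {rates_without_modals[lang]:5.2f}")
```

Under these assumptions the first column reproduces the values in Table 9, and the second column places Afrikaans between German and Dutch, in line with the intermediate position described above.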

A finer-grained comparison can be made on the basis of the typology in
Figure 1.


IPP VERBS
  SUBJECT-ORIENTED
    MODAL
      CORE MODAL          NL: 40.14%   DE: 66.75%   AF: 0.10%
      EVIDENTIAL          NL: 0        DE: -        AF: 0.07%
    ASPECTUAL
      CORE ASPECTUAL      NL: 24.43%   DE: -        AF: 56.45%
      SEMI-ASPECTUAL      NL: 5.81%    DE: -        AF: 1.59%
    SUBJECT-CONTROL       NL: 9.90%    DE: 9.25%    AF: 10.22%
  OBJECT-ORIENTED
    CAUSATIVE             NL: 14.53%   DE: 22.5%    AF: 27.54%
    PERCEPTION            NL: 5.0%     DE: 1.5%     AF: 1.76%
    BENEFACTIVE           NL: 0.18%    DE: 0        AF: 2.27%


Fig. 1: A typology of IPP in Dutch, German and Afrikaans 


The percentages for Dutch and German are quoted from Augustinus/Van Eynde 
(2017). They concern the relative frequency in the respective treebanks. The per- 
centages for Afrikaans are based on the corpus study in section 3. 

Within the class of the modal verbs we make a distinction between core modals 
and evidential modals. The former are, as discussed previously, the most frequently 
used IPPs in Dutch and German, but they are nearly absent in Afrikaans. The few 
exceptions include some hits for mag ‘may’, wil ‘want’ and kan ‘can’. The evidential 
modals are rarely combined with a perfect auxiliary in any of the three languages.
When they are, they occasionally take the IPP form in Afrikaans, as in (40). This is also
possible in Dutch, but in order to attest it Augustinus/Van Eynde (2017) had to 
resort to internet search, since the treebanks did not contain any hits. 

Within the class of the aspectual verbs we make a distinction between the core 
aspectuals and the semi-aspectuals.²⁸ The distinction is relevant for Afrikaans,
since the former turn up in the double infinitive construction while the latter turn 
up in pseudo-coordination. It is also relevant for Dutch, since the core aspectuals 
take zijn ‘be’ as the auxiliary of the perfect, while the semi-aspectuals take hebben 
‘have’, cf. Ik ben gaan lopen ‘I went and ran’ versus Ik heb staan praten ‘I was 
talking’. The odd one out in this category is German. While the aspectuals account 
for more than half of the IPP verbs in Afrikaans and for nearly 30% of the IPP 
verbs in Dutch, the German aspectual verbs do not take the IPP form. 

The subject control IPP verbs are a heterogeneous group. There is one that 
occurs in all three of the languages, i.e. leer/leren/lernen ‘learn’, and there are 
three that occur in Afrikaans and Dutch, but not in German, i.e. probeer/proberen 
‘try’, durf/durven ‘dare’ and weet/weten in the sense of ‘manage to’. 


28 The term ‘semi-aspectual’ is adopted from Broekhuis/Corver (2015: 151).


The causative IPP verbs are few in number but relatively high in frequency:
laat and maak jointly account for 27.54% of the Afrikaans IPP verbs, lassen for 
22.5% of the German IPP verbs, and laten and doen for 14.53% of the Dutch IPP 
verbs. 

The perception IPP verbs are the same in the three languages, i.e. sien/zien/ 
sehen ‘see’ and hoor/horen/hören ‘hear’. Examples with voel/voelen ‘feel’ can be 
constructed and/or googled, but are not attested in the corpora that we consulted. 

The benefactive IPP verbs are help/helpen/helfen ‘help’ and leer/leren in the
sense of ‘teach’. They account for a very small percentage of the IPP verbs. For 
German we did not find a single hit in the treebanks. 

In contrast to Dutch and German, Afrikaans also allows the use of IPPs in 
combination with the auxiliaries of the passive, i.e. word and wees. The instances 
that we found in the corpora mainly concern the causative laat (226 hits), the 
aspectual begin (111 hits) and the control verb probeer (14 hits). 

The resulting classification can be compared to the one in Schmid (2005), 
which also covers Dutch, German and Afrikaans, as well as some regional variants 
of Dutch and German, i.e. Bernese German, Sankt Gallen German, Zürich German 
and West Flemish. Schmid distinguishes eight classes of IPP verbs, just as we 
do, but the details of the classification are different for the subject-oriented ones. 
Within the class of aspectuals, she differentiates between durative and inchoative, 
which in our classification are both core aspectuals, and she does not include the 
semi-aspectuals, which is justified for German, but not for Dutch or Afrikaans. The 
evidential modals are called raising verbs in Schmid (2005), which is a misnomer, 
since the modals and most of the aspectuals are raising verbs as well. 

As shown in Augustinus/Van Eynde (2017), Schmid’s claims about the status 
of IPP for the eight classes (obligatory, optional or impossible) are in line with the
judgments in the literature and with corpus data for German, except in the case 
of the causative lassen ‘let’, which she classifies as an obligatory IPP verb, while 
it is in fact an optional IPP verb, as claimed in Duden (2006) and confirmed by the 
corpus data discussed in Augustinus/Van Eynde (2017). For Afrikaans, Schmid 
claims that IPP is optional for all classes, except for the raising verbs, which 
according to her do not allow it. This is contradicted by the fact that blyk ‘seem’ 
does occur as IPP in the corpora we consulted. The characterization of the other 
classes as optional IPP verbs is not by itself erroneous, but it is misleading since 
it does not differentiate between classes where the IPP form is by far the most 
frequent (the double infinitive construction), classes where it is more or less in 
50-50 relation with the non-IPP form (pseudo-coordination) and classes where 
it is marginal (the modals).


5 Conclusion


We conducted a corpus study of two Afrikaans corpora in order to identify the 
verbs showing the IPP effect. The corpus results reveal that the majority of the IPP 
constructions concern instances of the double infinitive construction. In addition, 
Afrikaans IPPs also occur in constructions with pseudo-coordination and in 
passive constructions. 

Next, we compared the results to a previous corpus study on IPP in German
and Dutch. IPP occurs in a wider range of constructions in Afrikaans than in Dutch
and German, where it is only possible in the double infinitive construction. In order to compare
the verb types that figure in IPP constructions, we extended the classification in 
Augustinus/Van Eynde (2017) on Dutch and German IPP to Afrikaans. With respect 
to the verb types that may appear as IPP, Afrikaans resembles Dutch, as the IPP 
shows up in the same verbal categories (i.e. modal, aspectual, subject control, 
causative, perception and benefactive). The relative frequencies in the corpus study
indicate, however, that the two languages differ considerably with respect to the category
of core modal verbs. While this is the prototypical IPP category in Dutch, Afrikaans
modals hardly show the IPP effect. Instead they canonically use an alternative 
construction to express the past. The prototypical IPP category in Afrikaans is the 
set of aspectual verbs. If we compare Afrikaans to German, we see that German 
aspectual verbs never occur as IPPs. 

In future work we aim to create a large treebank for Afrikaans. Due to the 
limitations of the corpora, we had to limit our queries to constructions in which 
the verbs were (almost) adjacent. Using a (large scale) treebank would allow the 
investigation of constructions with non-adjacent words in a more systematic way.


Part 2: Diachronic Perspectives


Mirjam Schmuck (Mainz) 

The grammaticalisation of definite articles 
in German, Dutch, and English: a micro- 
typological approach 


Abstract: Although they share definite articles as a common feature, Germanic
languages diverge considerably with respect to these articles’ functional domains.
Restrictions concern generic uses on the one hand and combinations with proper
names on the other, both of which represent later stages in grammaticalisation. Taking
three West Germanic languages into account, German, Dutch, and English, it is 
shown that the semantic-pragmatic extension proceeds along the hierarchy defi- 
nite > generic > onymic with the spread singular > plural generics and non-proto- 
typical > prototypical proper names (i.e. with/without appellative heads) as inter- 
mediate steps. It will be argued that this development is most advanced in 
German, where both the generic and the onymic article are extensively used, 
which is not the case in English. Allowing for both the generic and the onymic 
article but with restrictions, definite articles in Dutch represent an intermediate 
stage of functional expansion. 


Zusammenfassung: Aus einem Demonstrativ entstandene Definitartikel sind ein 
gemeinsames Charakteristikum der germanischen Sprachen. Trotz des gleichen 
Entstehungsweges bestehen klare funktionale Unterschiede; insbesondere diver- 
giert der Gebrauch in generischen Kontexten und die Kombinierbarkeit mit in- 
härent definiten Eigennamen. Der Beitrag fokussiert, aus mikro-typologischer 
Perspektive, die funktionale Extension des Definitartikels in drei west-germani- 
schen Sprachen (Deutsch, Niederländisch und Englisch). Es wird gezeigt, dass 
die Grammatikalisierung der Hierarchie Demonstrativ > Definitartikel > generi- 
scher Artikel > onymischer Artikel folgt, mit der Expansion Singular > Plural 
(Generika) sowie nicht prototypische > prototypische Eigennamen (d.h. mit/ohne 
appellativischem/n Kopf) als Zwischenetappen. Diese Entwicklung ist im Deut- 
schen am weitesten fortgeschritten, gefolgt vom Niederländischen. Die meisten 
Restriktionen gelten für den Definitartikel im Englischen. 


Open Access. © 2020 Schmuck, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.


1 Introduction


Definite articles are a common Central European feature. However, the fact that
their functional domains diverge considerably even within Germanic languages
became evident during a presidential debate in 2016, when Donald Trump
provoked a discussion on racism solely through his use of the definite article in
English, a use which would probably have gone unnoticed in German.¹

The news website Quartz (https://qz.com/) comments as follows in a gloss
from October 11, 2016:²


Linguistics explains why Trump sounds racist when he says “the” African 
Americans 

One of the littlest words in the English language gives the biggest clue about where 
Donald Trump’s head is at: his use of the word “the.”

In the second US presidential debate on Oct. 9, Trump promised, “I’m going to help 
the African-Americans. I’m going to help the Latinos, Hispanics. I’m going to help the 
inner cities. [Clinton has] done a terrible job for the African-Americans.” [...] 

If Trump had said, “I’m going to help African-Americans,” we’d assume he meant 
African-Americans in general - whichever ones need help. Under normal circum- 
stances, saying “the African-Americans” would raise the question: Which African- 
Americans? [...] he intends to refer to all African-Americans, and so “the” seems 
unnecessary. But it is doing something. It takes that plural, “African-Americans,” and 
makes the group into more of an undifferentiated whole. [...] “The” makes the group 
seem like it’s a large, uniform mass, rather than a diverse group of individuals. This is 
the key to “othering:” treating people from another group as less human than one’s 
own group. 

Trump’s “the” works as a dog-whistle to disaffected rural white voters attracted to his 
message. [...] 


A woman with African-American roots reacted as follows: 


I REALLY need him to stop calling me “THE African-Americans” 
because ARE YOU KIDDING ME? 


Definite articles represent a characteristic feature of (western) European, mainly
Germanic, Romance, and Balkan languages. They may also be described as a
Euroversal³ (e.g. Haspelmath 1998, 2001; Heine/Kuteva 2006). According to Dryer
(1989), only a third of the languages of the world employ definite articles (125 out
of about 400 languages) and less than 8% (31 languages) use both definite and
indefinite articles. With respect to the worldwide sample, the European language
area stands out typologically in that up to 39% of European languages have
definite and indefinite articles at their disposal, and an additional 15% have
developed definite articles only (cf. Heine/Kuteva 2006).⁴


1 I wish to thank two anonymous reviewers for their insights and criticisms, which helped
substantially improve the quality of this article.

2 Link to the article: https://qz.com/806174/second-presidential-debate-linguistics-explains-why-donald-trump-sounds-racist-when-he-says-the-african-americans/
(29-3-2018). Many thanks to Damaris Nübling for drawing my attention to this article.

The development of definite articles from demonstrative determiners has 
frequently been the focus of typological research (cf. Greenberg 1978; Lehmann 
1995; Lyons 1999: 331-334; Heine/Kuteva 2006: 97ff.). In the framework of gram- 
maticalisation theory special attention has been paid to the semantic-pragmatic 
expansion into new functional domains, which has been described as a gradual 
spread in a predictable manner. In his essential work, Greenberg (1978) distin- 
guishes three main diachronic stages of development (cf. also Hawkins 2004; 
Heine/Kuteva 2006; de Mulder/Carlier 2011), as depicted in Figure 1. 


(DEMONSTRATIVE) > DEFINITE ARTICLE > SPECIFIC ARTICLE > NOUN MARKER 


Fig. 1: Stages of grammaticalisation according to Greenberg (1978) 


At Stage I the former demonstrative “becomes compulsory and has spread to 
the point at which it means ‘identified’ in general” (Greenberg 1978: 61). Unlike 
(discourse-)deictic demonstratives that are restricted to situational or anaphoric 
use (e.g. this/the book over there, I bought a book - this/the book), definite articles 
typically denote entities that are not immediately given (i.e. non-situational). 
Instead, uniqueness in the discourse is based on a larger context such as a prior 
conversation (e.g. Did you watch the film, finally?), general knowledge (the presi- 
dent referring to the current president) or “stereotypic ‘frames’ (a house : the 
door)” (Hawkins 2004: 85, cf. also Himmelmann 1997: 93-101 on anamnestic 
determiners). In the terms of Hawkins, definite articles must be unique within the 
“pragmatic set” (or “P-set”) shared between speaker and hearer (cf. Hawkins 
1978, 1991; cf. also Lyons 1980). Representing Stage II in Greenberg’s hierarchy,
specific or “non-generic” articles include “non-definite specific uses” (Greenberg
1978: 62). Stage II articles denote individual referents that are not yet part of the
shared pragmatic set of speaker and hearer. They are typically used when
participants are newly introduced in the discourse with “instances of non-referential
use” also being included (e.g. I eat *the/an apple a day) (Greenberg 1978: 62; cf.
also Himmelmann 1997: 101-109 and König 2018 for a critical discussion of specific
articles as a further development of definite articles). Specific articles are
characteristic of Niger-Congo and Austronesian languages (cf. Greenberg 1978: 62-68;
Himmelmann 2001) but - with the exception of a Northern Swedish dialect⁵ (cf.
Dahl 2004) - they are not attested in European languages, where the indefinite
article takes over (e.g. EN I’ve got *the cat/a cat, I need *the/a new car; cf. Heine/
Kuteva 2006). Articles that developed into mere noun or gender markers
represent the last step (Stage III) in Greenberg’s cline, which implies that “the mass of
common nouns now only have a single form” and that the article “no longer has
any synchronic connection with definiteness or specificity” (Greenberg 1978:
69). Stage III-articles are not attested in European languages either. Nominalisation
constitutes, of course, a central function of articles particularly in English
(to run - the run, green - the green). Still, in the languages investigated, the
definite article requires an individual referent that is identifiable by the hearer and
it stands in opposition to the indefinite article (a/the run, a/the bright green).
Figure 2 takes up Stages I-III according to Greenberg (1978) and provides English
examples.


3 The term Euroversal was introduced by Kortmann (1997: 271-288), who focuses on
morphosyntactic properties of adverbial subordinators.

4 More marginal article systems attested in European languages that deviate from the common
western European type are described in Schroeder (2006), e.g. systems with two definite articles
(northern European) and definite articles going back to possessive suffixes (eastern European);
on northern European cf. also Dahl (2004).

Considering different language families and types, mainly African but also
Austronesian and Australian languages, the three stages distinguished within
this “macro”-typological perspective appear less appropriate to describe
grammaticalisation in the western European languages. Belonging to a small set of
languages worldwide that have developed both definite and indefinite articles
(less than 8% in Dryer 1989’s sample), European languages stand out in that the
indefinite article takes over in the domain of specific reference (Stage II) (cf. also
Harris 1980: 81f.). Thus, for the European sprachbund and for the closely related
West Germanic languages in particular, where uses typical of Greenberg’s Stage
II/III-articles hardly ever occur, a more fine-grained scale is needed.


5 As shown in Dahl (2004: 172f.), in a Northern Swedish dialect, the suffixed definite article
occurs with referents that are specific but not identifiable by the hearer. The following example
is provided:

Skellefteå (Västerbotten)
Hä gick skaplit att klaar sä,
it go:PST okay to survive
meda ä fanns rått-än å mus-än
as long as it exist:PST rat-DEF.PL and mouse-DEF.PL
‘[The cat thinks:] It was kind of OK [to live in the forest], as long as there were rats and
mice...’


referential                                                       non-referential
Stage 0             Stage I               Stage II                Stage III
DEMONSTRATIVE       DEFINITE ARTICLE      SPECIFIC ARTICLE        NOUN MARKER
The book over       Did you buy the       I’ve got *the cat.
there is mine.      book, finally?        > I’ve got a cat.


Fig. 2: Greenberg’s cline applied to Germanic languages (English) 


Lyons (1999) compares the functional domains of definite articles in four Euro- 
pean languages (English, French, Italian, and Greek) from a synchronic perspec- 
tive. The following picture emerges, see Figure 3: 


1 (English): simple definite⁶

2 (French): simple definite, generic 

3 (Italian): simple definite, generic, possessive 

4 (Greek): simple definite, generic, possessive, proper noun 


Fig. 3: Expansion in article use according to Lyons (1999: 337) 


Of course, the purely synchronic data from only four languages have to be inter- 
preted with caution. The figure is explicitly not intended as a universal implica- 
tional scale. Still, the data reflect the following diachronic evolution in the lan- 
guages considered: 


simple definite > generic > (possessive) > proper noun 


Although building on a small set of data only, this scale appears to be a promising 
starting point for the present purpose, all the more so since proper names are 
also taken into account. However, possessives will be discarded in the following 
analysis. They have to be the subject of future research. 


6 “Simple definite” articles correspond to Stage I-articles in the terms of Greenberg. 
The development demonstrative > definite article is most straightforward and
well documented (cf. Himmelmann 1997, 2001; Lehmann 1995; Szczepaniak 2011: 
71-78). Less attention, in particular from a cross-linguistic perspective, has been 
paid so far to later stages in the grammaticalisation of articles, namely the gradual 
expansion to generic uses and, most notably, to proper names - a function generally
subsumed under the so-called onymic article. Focusing on three closely
related West Germanic languages, namely German, Dutch, and English, it will be
shown that substantial differences exist with respect to the functional expansion
of definite articles. The following scale is proposed, see Figure 4.


English: simple definite
Dutch: simple definite, generic
German: simple definite, generic, proper noun


Fig. 4: Functional expansion in English, Dutch, and German (hypothetical) 
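
The hypothesised ordering in Figure 4 can be restated as a small data structure. The following Python sketch is purely illustrative: the names SCALE, ARTICLE_FUNCTIONS, and stage are assumptions introduced here, and the function sets simply copy Figure 4 (with possessives left out, as in the text). It checks that each language covers an unbroken initial stretch of the scale simple definite > generic > proper noun and ranks the languages accordingly.

```python
# Scale adapted from Lyons (1999), with possessives discarded as in the text.
SCALE = ["simple definite", "generic", "proper noun"]

# Functions of the definite article per language, copied from Figure 4
# (hypothetical values, to be tested in the following sections).
ARTICLE_FUNCTIONS = {
    "English": {"simple definite"},
    "Dutch": {"simple definite", "generic"},
    "German": {"simple definite", "generic", "proper noun"},
}


def stage(functions):
    """Return how far along SCALE the functions reach, or None if a step is skipped."""
    n = len(functions)
    return n if set(SCALE[:n]) == set(functions) else None


# Rank the languages from least to most advanced functional expansion.
ranking = sorted(ARTICLE_FUNCTIONS, key=lambda lang: stage(ARTICLE_FUNCTIONS[lang]))
print(ranking)
```

A set such as {"generic"} alone would yield None, since it skips the simple definite stage; in this sense the sketch treats the scale as implicational only for the three languages considered here.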


The next section provides a short overview of the diachronic evolution by summa- 
rising the relevant research for German and English; for Dutch, diachronic data 
are unfortunately scarce. 


2 Stages of grammaticalisation: 
Diachronic overview 


2.1 Diachronic overview: German 


In German, definite articles are missing in the earliest texts, but the demonstrative 
determiner grammaticalises in the course of the Old High German (OHG) period 
(750-1050) towards an incipient Stage I-article. The development from demonstra- 
tive determiner to definite article in Old High German (Stage 0 > Stage I-article) 
has been the subject of several corpus-based studies (cf. Oubouzar 1992, 1997; 
Leiss 2000; Szczepaniak/Flick 2015; Flick 2017). In early Old High German, the 
determiner ther is restricted to discourse deictic functions and has to be interpreted 
as a demonstrative. For its transition to the definite article, anamnestic uses are 
considered as the starting point (cf. Himmelmann 1997: 93-101). At this point, 
definite reference is no longer based on the concrete situation or discourse context 
but on shared cultural or religious knowledge (e.g. OHG diu magd ‘the virgin’ for
the Virgin Mary, diu skirft ‘the writing’ for the Bible) (cf. Szczepaniak 2011: 71-73). From
the 9th century onwards, non-situational uses are increasingly attested and OHG
ther gradually evolves into a marker of definiteness. Being initially restricted to 
human referents, it expands along the animacy hierarchy human > animate > 
concrete > abstract in the course of the OHG period (cf. Szczepaniak 2011: 74f.). 
The first instances of definite articles in generic contexts are attested as early as 
in the 9th century, in the OHG Tatian translation (ther man ‘the human’) but occur 
more regularly only from the 10th/11th century onwards. Selecting no individual 
entity but the class as a whole (e.g. the species man), generic articles referring to 
kinds pave the way for the expansion to indefinite or non-referential contexts 
(cf. chapter 3). As a last step, about 500 years later, from the late Early New High 
German period onwards (1350-1650), the definite article starts to spread to per- 
sonal names in German vernaculars (16th century) (cf. chapter 4.1), where, similar 
to unique nouns (e.g. the moon), the article functions as a mere expletive marker 
(on the notion of an expletive article see Longobardi 1994; Gallmann 1997; Sturm 
2005: 114-120; but Karnowski/Pafel 2005 for an opposing view). Figure 5 provides 
an overview of the diachronic development (cf. also Szczepaniak 2011: 78). 


DEMONSTRATIVE > DEFINITE > GENERIC > ONYMIC

anaphoric:   gisah einan figboum [...] then figboum ‘he saw a fig tree [...] this fig tree’
anamnestic:  dhiu magad ‘this/the virgin’ (Virgin Mary)
individual:  ther cuning ‘the king’
expletive:   ther mano ‘the moon’
generic:     ther man ‘the human’
names:       der Daniel ‘the Daniel’ (colloquial)

Periods: OHG (9th c.) > late OHG (10th/11th c.) > MHG (11th-14th c.) > NHG (17th c.)


Fig. 5: Diachronic expansion of the definite article in German 


2.2 Diachronic overview: English 


The definite article in English has been the subject of numerous, mainly syn- 
chronic studies (e.g. Christophersen 1939; Hewson 1972; Carlson 1978; Hawkins 
1978; Chesterman 1991; Lyons 1991, 1999). The major steps in the diachronic 
evolution presented below follow Hodler (1954), Hewson (1972), and the Cam- 
bridge history of the English language (Hogg (ed.) 1992; Blake 1992; Lass (ed.) 
1999). For the early stages, the development from demonstrative to definite article,
the evolution parallels that observed in Old High German. In the Old English
period (500-1100), the fully inflecting OE determiner se (masc.), seo (fem.),
þæt (neut.) is still restricted to discourse-deictic uses and expands only gradually
to definite contexts. “In Old English many types of noun, and certain types
of usage, show resistance to an article that has traces of demonstrative force” 
and seem “more like a mixture of demonstrative and article” (Hewson 1972: 18). 
In Middle English (1100-1500), when strong vs. weak adjectival inflection as an 
alternative marker of (in)definiteness got lost and the indefinite article emerged, 
the usage of articles comes to be further systematised in that a clear-cut distinction
is established between both deictic that (< OE þæt) and non-deictic, invariant
the (< OE se, seo) on the one hand and the definite (the) and the indefinite
(a/an) article on the other hand (Blake 1992: 217). In Early Modern English 
(1500-1800), however, the definite article still tends to be left out in combina- 
tion with abstract nouns (cf. Lass (ed.) 1999: 191f.). Definite articles with a 
generic reading are found only occasionally during the Old English period (OE Se 
lareow scal bion on his weorcum healic ‘the/a teacher must excel in his works’; cf. 
Hogg (ed.) 1992: 176) but they occur more regularly by the end of the Middle Eng- 
lish period with kind-referring singular nouns of the type the cat loves comfort 
(cf. Lass (ed.) 1999: 191). Also in Modern English, singular definite generics are 
missing with abstract nouns (EN *the life is beautiful vs. GE das Leben ist schön) 
(cf. Wandruszka 1969: 190; Schaden 2012) and mass nouns (*the wine, *the rice) 
(cf. also chapter 3.1); definite plural generics are by and large restricted to nomi- 
nalised adjectives (the poor, the French) (Lyons 1991, 1999: 189-193, but cf. also 
chapter 3 for corpus data). In combination with unique nouns (þe heouene ‘the
heaven’, þe sonne ‘the sun’), definite articles can be found from the Middle English
period onwards, but they did not survive in all instances in Modern English
(the sun, the moon but *the heaven, *the paradise, *the hell⁷). Further restrictions
concern prepositional phrases and the onymic article. In English, unlike in
Modern German, the article is often missing in prepositional phrases, in particular
when reference is non-specific (EN she goes to church, he came after lunch vs.
GE sie geht in die Kirche, er kam nach dem Mittagessen) (cf. Löbner 1985: 307).
With only a few exceptions, mainly names of rivers (the Rhine), the onymic arti- 
cle is not available. Its use appears to be very restricted even if the name is pre- 
modified by an adjective (*the little Mary) (cf. chapter 4.2 for more details). For 
an overview see Figure 6. 


7 Pointing to the divergent behaviour of *the heaven vs. the sky, Wandruszka (1969: 193) goes 
even further and interprets the religious terms heaven, paradise, and hell as proper names. 
To sum up, definite articles appear to be more restricted in use in English
than in German. This holds true for their combinability with unique and abstract 
nouns and, most particularly, for generic contexts and in combination with proper 
names. The fact that the functional evolution is less advanced in English has 
already been pointed out by Curme (1922: 67): 


A difference of development or conception in some cases leads to a different use of the 
article in the two languages: [...] it becomes apparent that English has preserved much 
better than German the old simple form of the noun without the definite article wherever it 
represents a person or thing single in kind, like a proper name. 


Considering both the generic and the onymic article, in the following the func- 
tional expansion of the articles will be investigated in more detail. 


Demonstrative > Definite > Generic > Onymic

anaphoric:   þæt cild weox and was gestrangod ‘this/that child (i.e. Jesus Christ) grew and
             became strong’
individual:  se wulf ‘this/the wolf’
expletive:   þe sonne ‘the sun’; but: NE *the heaven
generic:     the cat (loves comfort)
names:       *the Peter, *the Mr. Smith

Periods: OE (9th c.) > OE/ME (11th-13th c.) > (Early) ModE (16th c.)


Fig. 6: Diachronic expansion of the definite article in English 


3 Generic articles 


3.1 Prototype- vs. kind-referring generics 


As has been pointed out in the vast body of literature on generics, both the
indefinite and the definite article are in principle available for generic reference,
but the two are associated with different generic types, cf. (1):


(1) A tiger is striped. / *A tiger almost died out. 
Tigers are striped. / Tigers almost died out. 
The tiger almost died out. / ?The tiger is striped. 
The indefinite singular article operates on the level of individuals and singles out
a prototypical member of its kind (e.g. a tiger is striped).⁸ For this type the term
prototype-referring is introduced here. In the case of plural indefinites or bare
plurals, all prototypical individuals are selected (Tigers are [generally] striped).
Still, exceptions are allowed, which explains why plural indefinites are more
widely applicable.⁹ The definite article, by contrast, is typically kind-referring
and hence more restricted in use. Operating on a higher taxonomic level, it selects
the class as a unit, e.g. the tiger, the lion etc. as definite uniform entities of the
hyperonym mammals, and hardly allows for exceptions. Definite generics are the
first choice in sentences with kind-level predicates (die out, be numerous),
whereas indefinite generics are typically combined with individual-level predicates
(be striped, be intelligent) (cf. Carlson 1978; Kratzer 1995; Barton/Kolb/
Kupisch 2015).¹⁰ The fact that definite articles and indefinite articles or bare plurals
operate on different taxonomic levels can be illustrated by the following German
examples taken from Laca (1992: 268), cf. (2):


(2) Die Deutschen trinken im Durchschnitt 500 Millionen Liter
The Germans drink an average of 500 million litres
Bier pro Jahr
beer per year
‘Germans drink an average of 500 million litres of beer per year’ (all together)

?? Deutsche trinken im Durchschnitt 500 Millionen Liter
Germans drink an average of 500 million litres
Bier pro Jahr
beer per year
‘Germans drink an average of 500 million litres of beer per year’ (each German)


8 However, as Nickel (2012) pointed out, strong and weak generics have to be distinguished, in- 
volving majorities (e.g. tigers are striped) or indeed minorities (e.g. Dutchmen are good sailors). 
The first have also been referred to as definitional generics in the literature (cf. Krifka 2012). 

9 The exact status of bare plurals is, however, unclear and has been controversially discussed 
in the literature, cf. e.g. Krifka (2004) on “Bare NPs: Kind-referring, indefinites, both or neither?” 
10 In the literature, several terms have been introduced to refer to both types. Taking formal 
aspects (domains of definite vs. indefinite articles) into account, Gerstner/Krifka (1993) use the 
terms D-genericity vs. I-genericity (cf. also Platteau 1980 on “definite and indefinite generics”). 
Krifka et al. (1995: 2-3) distinguish between “kind-referring NPs” (The potato was first cultivated 
in South-America) and “generic sentences” (A potato contains vitamin C, amino acids, protein 
and thiamine.), cf. also Behrens (2000: 6-8) for a critical discussion. For the present purpose, 
however, the distinction prototype- vs. kind-referring seems most straightforward. 
In the first case, the definite article evokes a collective reading ‘Germans all together
drink about 500 million litres of beer a year’, which is not that much.¹¹ The second
example, however, implies a distributive reading and would require every single 
German to drink this quantity individually. 

Basically, four options exist for generic NPs, but not all of them are equally
available in all three languages considered. Whereas productive definite singular
generics represent a common feature, this is not the case for plural generics. For
English, Gerstner/Krifka (1993: 967) ascribe to definite generics “a rather marginal
status”. Likewise, Hawkins (2004: 85) states that


German has gone further than English and regularly uses the definite article with generic 
plurals where English does not: er zieht den Rosen die Nelken vor (he prefers Def+Dat+Pl 
roses Def+Acc+Pl carnations) ‘he prefers carnations to roses’. He prefers the carnations to 
the roses in English suggests pragmatically identifiable sets of each. 


Even though far from being “regularly” used in all contexts as assumed by Hawkins,
definite plural generics are indeed much more common in German than in
English, as will be shown in the next section. To start, Table 1 compares the
inventories for generic reference in English, Dutch, and German following the
respective standard reference grammars (Quirk et al. 1999; Duden 2016; ANS).


Table 1: Availability of generic articles in English, Dutch, and German according to the standard 
grammars 


                     English                        Dutch                          German
prototype-referring  A tiger is striped.            Een tijger is gestreept.       Ein Tiger ist gestreift.
                     Tigers are striped.            Tijgers zijn gestreept.        Tiger sind gestreift.

kind-referring       The tiger is threatened        De tijger dreigt uit te        Der Tiger droht
                     with extinction.               sterven.                       auszusterben.
                     *The tigers are threatened     *De tijgers dreigen uit te     ?Die Tiger drohen
                     with extinction.               sterven.                       auszusterben.


In the singular, definite NPs are common and constitute the first choice for kind- 
reference — at least when combined with count nouns (the tiger, the potato). As 


11 In fact, Germans consumed 84.6 million hectolitres in 2018, which, on average, corresponds to
102 litres each (cf. https://de.statista.com, last accessed: 7-10-2019).
for human referents (nationality terms), singular generic NPs are grammatical in
German, but involve a contemptuous reading and are therefore rather disfavoured 
(cf. Duden 2016: 295, § 390). In combination with mass (rice) or abstract (love) 
nouns definite singular generics are effectively more restricted and indeed 
ungrammatical in English (*the rice, *the love). In Dutch, they are hardly accept- 
able for mass nouns (??de rijst), but regularly occurring with abstract nouns (de 
liefde). In German, both mass and abstract nouns take the definite article (der 
Reis, die Liebe). However, as a reflex of former anaphoric uses, generic articles 
are better accepted in the singular and in subject position — a restriction that is 
best described as a phenomenon of ‘persistence’ in the terms of Hopper (1990, 
1991). The spread to (generic) object NPs constitutes a characteristic feature of
languages representing an advanced stage of article grammaticalisation, such as
French, cf. (3):


(3) EN Rice was introduced in Europe in the 10th century.
       I like rice.
    DU ??De rijst was geïntroduceerd in Europa in de 10de eeuw.
       Ik hou van rijst.
    GE Der Reis wurde im 10. Jahrhundert in Europa eingeführt.
       Ich mag Reis.
    FR Le riz a été introduit en Europe au 10ième siècle.
       J’adore le riz.


Further restrictions concern definite plural generics, which are not available in
English, so that bare plurals (indefinite plurals) are chosen instead. However, definite
plurals are exceptionally allowed for nouns derived from adjectives (the poor, the 
rich), including nationality terms such as the French, the Chinese (cf. Lyons 1991). 
In those cases, the definite article first of all functions as a noun marker and is 
obligatorily used both in the singular and in the plural, which allows Lyons (1991: 
105) to conclude: “So we have two types of plural generic: the indefinite restricted 
to nouns, and the definite restricted to adjectives.” (Cf. also Lyons 1999: 181ff.) In 
the Dutch standard reference grammar the definite plural is marked as doubtful 
(“twijfelachtig”) (cf. ANS, section 14.3.2). What is more, unlike in French, where 
the definite article is the first choice for generic reference (cf. Laca 1992; Lyons 
1999: 51), in the other languages definite plural generics are restricted to human 
referents, mainly nationality terms (cf. chapter 3.2), cf. (4): 


(4) EN Books/Cats are my greatest passion. 
DU Boeken/Katten zijn mijn grootste passie. 
GE Bücher/Katzen sind meine größte Leidenschaft.
FR Les livres/Les chats sont ma meilleure passion. 
In order to obtain a more complete picture of the use of definite articles in
generic sentences in the three languages considered, a small corpus study has 
been conducted. The results will be presented in the following section. 


3.2 Generic articles: Corpus data 


Generics have been the subject of several corpus-based studies, most notably 
generics in English. Recent data on German and English, among others, are pro- 
vided in Behrens (2000); generics in Dutch have been extensively studied in 
Oosterhof (2008); plural generics in German by Barton/Kolb/Kupisch (2015) and, 
from a language learners’ perspective, in Kupisch/Barton (2013) and Kupisch/ 
Pierantozzi (2010). In her contrastive study on generics in German, English, French, 
Hungarian, and Greek based on the story “Le petit prince” and its translations, 
Behrens (2000: 23) concludes that in English 


the use of the definite article is significantly more weakly attested than in the other lan- 
guages. More precisely, the percentage of definite phrases in English both in the singular 
(11,24%) and in the plural (24,42%) is approximately twice as low as the percentage of 
definite phrases in the other languages. 


In addition, what is emphasised is the exceptional prominence of the bare plural 
in English. Barton/Kolb/Kupisch (2015) investigated the semantics of plural gener- 
ics in German on the basis of acceptability judgement tasks. Their results support 
the claim that in German, besides bare subjects with an acceptability of 99.5%, 
definite articles are also accepted in the plural (67.7%), in particular when referring 
to kinds (kind-level 84.9% vs. individual level: 61.9%). Also, taking sociolinguistic 
aspects into account, age has been identified as a decisive factor: Older partici- 
pants accepted significantly more definite plural generics than younger ones 
and, moreover, the preference for definite plural generics in kind-referring sen- 
tences declined with age. Based on modern Dutch corpora, spoken and written 
(INL-corpora, CONDIV-corpus)¹², Oosterhof’s (2008) findings emphasise an equally
strong correlation between kind-reference and the use of definite articles for Dutch. 
In combination with the kind predicate uitsterven/uitgestorven ‘die out/died out’, 
the percentage of definite articles amounts to 38% (singular definites) and 47% 


12 INL-corpora = Corpora of the Instituut voor Nederlandse Lexicologie
(http://corpushedendaagsnederlands.inl.nl/) (29-3-2018); CONDIV-corpus = corpus of the project Lexicale variatie in
het Standaardnederlands. Convergentie/divergentie en standaardisering/substandaardisering 
in Nederland en Vlaanderen. 
(plural definites) respectively, whereas indefinite singular NPs (3%) and bare
plurals (13%) are underrepresented (Oosterhof 2008: 79-82). Another study
conducted on the basis of corpora and questionnaires focused on nationality
(e.g. Italiaan ‘Italian’, Braziliaan ‘Brazilian’) and animal terms (e.g. orang oetan
‘orang-utan’, zeehond ‘seal’) in generic sentences (both, however, erroneously
classified as nationality names and animal names¹³). Considering also areal factors
(Netherlands vs. Belgium), the collected data reveals, first, a higher overall per- 
centage of the generic article in the Flemish area (Belgium) and second, a signifi- 
cantly higher acceptance of singular definites in combination with animal terms 
(animal vs. nationality terms: NL 39% : 13%, BE 32% : 18%); conversely, plural 
definites occur more frequently with nationality terms (animal vs. nationality 
terms: NL: 4% : 21%, BE: 3% : 25%). Ranging between 48% and 64%, bare plurals 
are, however, preferred in all subsets of data (cf. Oosterhof 2008: 103). 

In order to directly compare the acceptance of the generic article in the three
languages considered, data was collected from the COW-corpus,¹⁴ a web-based
corpus with sub-corpora for English (ENCOW), German (DECOW), and Dutch
(NLCOW) (cf. Schäfer/Bildhauer 2012; Schäfer 2015). A first query concerned
the items ‘citizen’, ‘reader’, and ‘voter’, all of which refer to the nature
of citizens/readers/voters rather than to individuals and can therefore be classified
as kind-referring generics typically involving definite singular NPs. A second
query conducted in all three corpora focused on nationality terms. Referring to
prototypical individuals, they are typically associated with plural indefinites.
To obtain a maximum of generic sentences and to limit at the same time the total 
number of hits, both queries were restricted to items followed by a finite form of ‘to 
be’ (e.g. ‘citizen is/citizens are’). In the present analysis, the first 200 hits of each 
query were taken into account and, in a second step, all sentences with a generic 


13 Undoubtedly, animal and nationality terms share some properties with proper names (e.g. 
some kind of name-giving act for newly discovered species). Still, both of them behave like 
typical count nouns grammatically, do not refer to individuals and, what is most striking, they 
clearly have a denotative meaning. Accordingly, unlike proper names, they are not applicable to 
any entity in the world, but the object requires certain properties for being successfully referred 
to as orang-utan for example. 

14 The COW-corpus (= Corpora from the Web) is a web-based corpus with linguistic annotation 
comprising texts from a wide range of genres (e.g. press, comments, interviews). The subcorpora 
used for the present study are composed as follows: DECOW14: 20,495,087,352 words/17,147,104 
(Austrian, German, and Swiss German) documents; NLCOW 6,887,226,290 words/5,468,755 
(Dutch and Flemish) documents; ENCOW14: 16,821,840,292 words/9,216,176 documents repre- 
senting ‘World Englishes’ (see also http://corporafromtheweb.org/ (last accessed: 3-7-2019) for 
more details). 




interpretation were extracted using the test criteria for genericity defined in 
Krifka et al. (1995). The results are presented in Figures 7-8: 


[Bar chart: shares of sg. def., sg. indef., pl. def., and pl. indef. NPs in EN, DU, and GE; horizontal axis 0%-70%]

      English                            Dutch                        German
I.    The citizen is entitled to ...
II.   A citizen is entitled to ...       Een burger is bevoegd ...    Ein Bürger ist berechtigt ...
III.  The citizens are entitled to ...   De burgers zijn bevoegd ...  Die Bürger sind berechtigt ...
IV.   Citizens are entitled to ...       Burgers zijn bevoegd ...     Bürger sind berechtigt ...


Fig. 7: Generic article: Reference to kinds (n=401) 
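
The four NP types distinguished in Figures 7-8, together with the restriction of the queries to items followed by a finite form of ‘to be’, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the sample sentences in HITS are invented, the pattern only covers sentence-initial English ‘citizen’, and the names PATTERN, classify, and shares are hypothetical. The genericity of each hit would still have to be checked manually (cf. the test criteria in Krifka et al. 1995).

```python
import re
from collections import Counter

# Invented mini-sample; the study itself used the first 200 corpus hits per query.
HITS = [
    "The citizen is entitled to vote.",
    "A citizen is entitled to vote.",
    "The citizens are entitled to vote.",
    "Citizens are entitled to vote.",
    "Citizens are free to leave.",
]

# Optional article + noun + finite form of 'to be', anchored at the sentence start.
PATTERN = re.compile(r"^(?:(the|an?)\s+)?(citizens?)\s+(is|are)\b", re.IGNORECASE)


def classify(sentence):
    """Map a hit to one of the four NP types, or None if it does not fit."""
    m = PATTERN.match(sentence)
    if m is None:
        return None
    det, noun, verb = m.group(1), m.group(2).lower(), m.group(3).lower()
    plural = noun.endswith("s")
    if plural != (verb == "are"):  # number agreement between noun and verb
        return None
    if det is None:  # bare NP: only the bare plural counts
        return "pl. indef." if plural else None
    if det.lower() == "the":
        return "pl. def." if plural else "sg. def."
    return None if plural else "sg. indef."  # 'a/an' requires a singular noun


def shares(sentences):
    """Percentage of each NP type among the classifiable hits."""
    counts = Counter(filter(None, map(classify, sentences)))
    total = sum(counts.values())
    return {np_type: round(100 * n / total) for np_type, n in counts.items()}


print(shares(HITS))
```

Percentages of this kind, computed per language and per query, are what the bars in Figures 7-8 represent.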


According to the data presented in Figures 7-8, definite generics are most appropriate
in the singular when referring to homogeneous groups (‘citizens’, ‘readers’,
‘voters’), but significant differences can be observed between the languages
considered. Amounting to 58% and 55% in German and Dutch respectively,
definite singular NPs constitute the first choice when referring to kinds, which is
clearly not the case for their English counterparts, with a total of only 25% and with
bare (indefinite) plurals being preferred instead (59%). Reference to kinds is also
possible with definite plurals and yields, by and large, the same general picture:
again, the highest ratio is attested for German (25%), followed by Dutch
(20%) and finally English (10%). In combination with nationality terms denoting
a group of individuals, a divergent picture emerges: definite singulars are scarce
(EN 0%, DU 1%=2x, GE 8%=12x) and, in the few cases attested, they imply a negative,
stereotyped reading, cf. (5).


[Bar chart: shares of sg. def., sg. indef., pl. def., and pl. indef. NPs per language; horizontal axis 0%-80%]

      English                         Dutch                           German
I.    *The Americ. is proud of ...    *De Amerikaan is trots op ...   Der Amerik. ist stolz auf ...
II.   An American is proud of ...     Een Amerik. is trots op ...     Ein Amerik. ist stolz auf ...
III.  The Americ. are proud of ...    De Amerik. zijn trots op ...    Die Amerik. sind stolz auf ...
IV.   Americans are proud of ...      Amerikanen zijn trots op ...    Amerikaner sind stolz auf ...


Fig. 8: Generic article: Reference to prototypical individuals (nationality terms) (n=476)


(5) DU We weten het allemaal heel zeker, de Amerikaan is oppervlakkig
       ‘We know it all for sure, Americans are superficial.’¹⁵
    GE der Amerikaner ist kulturell eben sehr tief stehend
       ‘Americans simply are culturally underneath’¹⁶


15 https://www.amerika.nl/amerika/reisgids/cliches-amerika/amerikanen-zijn-oppervlakkig/ 
(last accessed: 3-7-2019). 

16 http://nachtkritik.de/index.php?option=com_content%26task=view%26id=3417%26ltemid=40 
(last accessed: 14-9-2019). 
Definite plurals, by contrast, are frequently attested and turn out to be the first
choice in German (pl. def. 66%, pl. indef. 22%) and, surprisingly, also predominate
in Dutch (pl. def. 51%, pl. indef. 45%). In English, where bare plurals again
represent by far the preferred option (68%), definite plurals achieve the lowest
percentage but still amount to 32%.¹⁷ Thus, definite plural NPs appear to be rather
unmarked in German, where they constitute the first choice (cf. also 67.7% acceptance
in Barton et al. 2015). Their English counterparts, however, are bare plurals. In Dutch,
definite and bare plurals are equally accepted, which may point to a transitional state
in which the definite plural is about to be established as the unmarked form. However,
the Dutch data presented in this study diverge considerably from the data provided in
Oosterhof (2008), where definite plurals of nationality terms are less frequently
attested, amounting to only 21% (NL) and 25% (BE) respectively.

With respect to a possible semantic load, a closer look at the data reveals that
the proportions of definite plurals vary significantly for each research item. In German
and Dutch, the share of definite articles is much higher when referring
to ‘Americans’ and ‘Frenchmen’, whereas for ‘Europeans’ indefinite plurals take
over (def. pl.: ‘Frenchmen’ GE 84%, DU 62%, ‘Americans’ GE 74%, DU 56%, ‘Europeans’
GE 45%, DU 27%). In English, definite plural generics achieve the highest
percentage for ‘Germans’ (52%), but are less frequent for ‘Europeans’ (29%) and
scarce for ‘Americans’ (10%).¹⁸ Thus, being rather avoided for self-reference, definite
generics predominantly refer to foreign nationalities (GE die Franzosen, die
Amerikaner vs. (die) Europäer; DU de Fransen, de Amerikanen vs. Europeanen; EN
the Germans vs. Europeans, Americans). These findings point to the fact that, in all
three languages considered, definite generics have (partially) retained their emphatic
character and may involve a contemptuous reading and the notion of “othering”,
a connotation that predominates in English, but is also present in Dutch, cf. (6).


(6) EN But it is not against the Germans that we hold our primary grudge.¹⁹
    DU De Amerikanen zijn verantwoordelijk voor de meeste oorlogen ter
       wereld.
       ‘Americans are responsible for most of the wars worldwide.’²⁰


17 Possible differences with respect to the different ‘Englishes’ have to be the subject of future 
research based on a larger set of data. 

18 For the English data, it has to be kept in mind, however, that several Englishes are included in ENCOW.
19 http://home.comcast.net/~e09066/1944/44-01/TL26.html (last accessed: 14-9-2019). 

20 https://blog.thesilvermountain.nl/krachtenbundeling-edelmetaal-centrum-amsterdam/ 
comment-page-5/ (last accessed: 3-7-2019). 
To sum up, whereas indefinite singulars and, most notably, bare plurals constitute
typical candidates for generic interpretation, the use of definite articles appears 
to be more restricted in earlier stages of grammaticalisation, namely to kind- 
reference in the singular (the tiger, the teacher). In the plural, definite generics 
evoke a contemptuous reading when combined with nationality terms. In conjunc- 
tion with the bleaching of deictic force during the process of grammaticalisation, 
the taxonomic interpretation weakens and definite generics gradually expand to 
groups of individuals. As has been shown, this development is most advanced in 
German, whereas English represents the other extreme with deictic force and the 
notion of “othering” still being prevalent. 


4 Onymic articles 
4.1 Diachronic development 


4.1.1 German 


As has been shown in chapter 2, the onymic article marks an advanced stage in 
grammaticalisation. Its rise in combination with personal names in German has 
been the subject of recent research. Onymic articles emerge in the late Early New 
High German period (16th/17th century) and spread from the southeast to the 
north triggered by case as one crucial factor among others (cf. Behaghel 1923: 
52f.; Schmuck/Szczepaniak 2014; Schmuck forthc.). Irrespective of name class,
names combined with attributive adjectives exhibit the onymic article significantly
earlier, obligatorily from the Middle High German period onwards;
remarkably, so do unmodified names of rivers (Paul 1919: 180). For early
attestations of the onymic article in combination with personal names and names 
of rivers cf. (7) and (8): 


(7) Song of Anno (end of 11th century): 
Von demi gezügi des stiphtis  Worti diu Semiramis / 
From the material of this building / fashioned the Semiramis /
‘From the material of this building/Semiramis fashioned’ 
Die burchmura viereggehtich. 
The town wall quadrangular. 
‘The town wall quadrangular.’ 
(8) Tannhäuser, 13th century
Rome bi der Tiver lit, der Arne get für Pise
Rome near the Tiber is located, the Arno goes in front of Pisa
‘Rome is located near the Tiber, the Arno flows through Pisa’ 


Paris bi der Seine lit, diu Musel get für Metzen 
Paris near the Seine is located, the Moselle goes in front of Metz
‘Paris is located near the Seine, the Moselle flows through Metz’ 


In combination with personal names, the onymic article is nowadays extensively 
used in southern German dialects including Swiss German and Austrian-Bavar- 
ian, where names combined with the onymic article in fact represent the 
unmarked option. It is rather its omission that is effectively felt as inconvenient 
(e.g. Nübling/Fahlbusch/Heuser 2015: 123-124). In the north, by contrast, it is 
the definite article that, depending on the context, may elicit a derogatory con- 
notation and has remained uncommon up to the present (Werth 2014). Yet, the 
unmarked use of the onymic article is currently spreading in colloquial speech,
proceeding from the south to the north with Central German as a transitional area
(cf. Bellmann 1990; Longobardi 1994: 653f.; Eichhoff 2000; Duden 2016: 301,
§398; Werth 2015).


4.1.2 Dutch 


As in German, names combined with an attributive adjective obligatorily take the
onymic article in Modern Dutch (de kleine Jan ‘the little Jan’); in southern
(Flemish) dialects, it appears regularly from the Middle Dutch period onwards
(13th century) (cf. also van der Horst 2008, I: 843f.). Also
in Middle Dutch, instances of bare, unmodified names combined with onymic
articles are occasionally attested, cf. (9)-(11):


(9) Rijmbijbel, West-Flanders 1285:
Maria, des mijnders Jacobs moeder
Mary, the younger Jacob’s mother
‘Mary, the mother of Jacob the Younger’


(10)  Rijmbijbel, West-Flanders 1285: 
Martha beclaghede die Magdalene 
Martha deplored the Magdalena 
‘Martha deplored Magdalena’ 
(11) Van onser vrouwen gheslachte, East-Flanders 1290:
Dat was ihesus dier marien kint 
That was Jesus the-GEN Mary-GEN child 
‘That was Jesus, Mary’s child’ 


Standard Dutch lacks the onymic article with unmodified personal names. How- 
ever, onymic articles are characteristic of Flemish dialects and a south-north 
decline of article use similar to the situation in the German-speaking area can be 
observed. The following examples are taken from the Dynamic Syntactic
Atlas of the Dutch dialects,21 cf. (12):


(12) SAND sentence 286
Herinneren jullie je nog dat we Jan op de markt gezien hebben?
‘Do you still remember that we have seen Jan on the market?’
a. Brabant, b. Limburg
a. Wette gelle nog da we de Jan op de met emme gezien?
b. Wit ier nog dat ya d’r Jan óp dr maar zoge?
Know you still that we the Jan on the market once saw / (have) seen?


Strikingly, onymic articles are far more common with men’s names, as
pointed out by van Langendonck (2007: 158): “In Dutch (Flemish) dialects the
article de ‘the’ is used before men’s names and sometimes before women’s names 
to express familiarity with respect to the name bearer, e.g., de Jan ‘the John’, de 
Marie ‘the Mary’.” 

According to van de Ven/Govaart (1917), the onymic article combined with
personal names appears in Brabant as early as the 17th century (de Jans, den
Theum)22 and at that time likewise induces a familiarity reading (“iets familiaars”).

Similar to German and English, names of rivers regularly require the onymic 
article in modern Dutch (de Rijn, de Moezel), with the first instances dating back 
to Middle Dutch (14th/15th century), cf. (13): 


21 The Dynamic Syntactic Atlas of the Dutch dialects (DynaSAND) is an on-line tool for dialect 
syntax research available at http://www.meertens.knaw.nl/sand/ (last accessed: 3-7-2019).

22 Similar to definite articles in combination with common nouns, the onymic article equally 
reflects the Flemish “accusativism” (cf. van Loon 1989), i.e. the phonologically conditioned alter- 
nation of de and den (i.e. former accusative article) with the latter being used when the noun/ 
name starts with a vowel or with one of the consonants h, b, d, t or r (e.g. de Jan vs. den Alex). I am 
grateful to Ann Marynissen and to an anonymous reviewer for drawing my attention to this 
aspect. 
(13) Reinaerde van den vos 1401-1410:
Tusschen dier Elve entjer Zomme 
Between the Elbe and=the Somme 


4.1.3 English 


In modern English, personal names are typically undetermined. Sporadic occur- 
rences of the onymic article with bare (unmodified) names concern honorific 
articles such as the Talbot, the Douglas (cf. Poutsma 1914: 570f.), with Donald 
Trump’s nickname the Donald representing a contemporary case in point. Cru- 
cially, the article is also absent with premodifiers (little Eric, poor Mary), even 
though certain adjectives exhibit the onymic article (the inimitable/ill-fated/ 
unfortunate John Smith). According to the literature, personal names remain 
undetermined if the adjective has an “emotive colouring” (old Mrs. Fletcher, poor 
Charles), whereas “[i]n a more formal rather stereotyped style, the adjective is
placed between the and a personal name” (e.g. the inimitable Henry Higgins) 
(Quirk et al. 1999: 290; cf. also Huddleston/Pullum 2002: 519f.). In other words, 
the article is absent when restrictive, name-like adjectives are concerned (little, 
old) but employed with attributive adjectives indicating a spontaneous judgment 
of the speaker about the person in question. Historically, however, sporadic 
instances of the syntactically conditioned onymic article can be found in contexts 
where the article is missing in the modern language, cf. (14): 


(14) Geoffrey Chaucer: Canterbury tales, late 14th century 
Wel knew he the olde Esculapius, 
And Deyscorides, and eek Rufus, 
Old Ypocras [...] 


Names of rivers require the onymic article (the Thames, the Rhine), which is
already attested in Middle English but did not become obligatory before the
Early Modern English period, thus differing from the situation in Middle Dutch
and Middle High German (cf. Hewson 1972: 18-20), cf. (15)-(16):23


23 Why names of rivers stand out for taking the onymic article so early (i.e. from the Middle Ages
on) has remained unexplored. The fact that in English the noun phrase may also be interpreted
as elliptic (the Hudson [river]/the [river] Hudson) does not provide a satisfactory answer as the
article is missing in similar cases like *the [Mount] Kilimanjaro (cf. also Anderson 2007: 106f.).
Neither does this explanation hold for Modern German (der *[Fluss] Rhein masc. ‘the Rhine’, die
*[Fluss] Mosel fem. ‘the Moselle’) or for Modern Dutch (de *[rivier] Rijn/Moezel). Rather, the onymic
article as marker of onymic gender has to be conceived of as a classifier in the sense of Nübling
(2015, forthc.) (e.g. EN: Hudson [family name] vs. the Hudson [river name], GE Warnow [city
name] vs. die Warnow [river name]).


(15) Ranulf Higden: Polychronicon, Engl. translation 14th/15th century:
wiþ þe Reyne in þe norþ side, wiþ þe Rone in þe est


(16) Bible, King James Version, 1611:
And were baptized of him in Jordan


In sum, the onymic article appears to be most established in German, where the
syntactically conditioned article is obligatorily used from the Middle High German
period onwards regardless of name type and, what is more, the onymic article is
currently about to be established in front of personal names. In Dutch, grammaticalisation
is less advanced. Similar to German, premodified names take the
onymic article obligatorily, whereas personal names remain undetermined,
with the exception of men’s names in southern dialects. English shows the most
restrictions: it only allows for the onymic article in combination
with names of rivers and, exceptionally, with personal names when combined with
non-restrictive adjectives. Comparing the divergent development in all three languages,
it is crucial that the functional expansion of onymic articles correlates with
the retention or loss of inflectional categories: in modern German, at one end
of the continuum, three categories (case, gender, and number) are expressed. In
modern Dutch, only two morphological categories are maintained (gender and
number) and only two genders (common and neuter) are distinguished. In addition,
article inflection displays more syncretism. In English, at the other end of the
spectrum, article inflection was completely lost during the Middle English period.

The present section has focused mainly on personal names. In the following,
more name types (prototypical vs. non-prototypical names) will be taken into
account in order to gain a more complete picture of the expansion of the onymic
article in modern languages.


4.2 Prototypical vs. non-prototypical names


In order to define prototypical vs. non-prototypical names, the distinction between
proper nouns (i.e. the lexical category of name) and proper names (i.e. definite
NPs referring uniquely to one entity in the world) is crucial (cf. Schlücker/Ackermann
2017). Prototypical names (name classes) consist of proper nouns (Anna,
London), whereas non-prototypical names (name classes) comprise definite NPs
with appellative heads (the White House) - often premodified by proper nouns
(Buckingham Palace).

Following the classification provided in Nübling/Fahlbusch/Heuser (2015: 
101-105), personal names and, to a lesser extent, also names of cities and coun- 
tries or continents represent prototypical names in that they make recourse to a 
specific inventory (proper nouns) thus being maximally distinct from common 
nouns. More recent name types are rather descriptive in nature, headed by com- 
mon nouns (descriptors) that indicate the class of objects, e.g. names of institu- 
tions (York University), buildings (Westminster Abbey), streets (Baker Street) or 
mountains (Mount Everest). These appellative heads enhance the use of onymic 
articles. Crucially, in English, names containing non-proprial heads also remain 
more often undetermined compared to German and Dutch, cf. (17)-(19): 


(17) GE Die Westminster Abbey wurde zwischen 1045 und 1065 erbaut. 
DU (De) Westminster Abbey is tussen 1045 en 1065 opgericht. 
EN Westminster Abbey was built between 1045 and 1065. 
But: The Empire State Building, The World Trade Center 


(18) GE Die Baker Street ist eine berühmte Straße in London. 
DU De Bakkersstraat is een korte straat in Amsterdam Centrum. 
EN Baker Street is a well-known street in the city of London. 


(19) GE Der Mount Everest ist der höchste Berg der Erde. 
DU De Mount Everest is de hoogste berg ter wereld. 
EN Mount Everest is Earth’s highest mountain. 


Making recourse to a specific inventory, names of countries and continents usually
remain undetermined in all three languages, which exceptionally also holds true
for premodified names in English (modern Brazil, ancient Rome).24 Thus, with the
onymic article being restricted to names of rivers and personal names combined
with non-restrictive adjectives, English stands out by a very limited use of the
onymic article (cf. also Quirk et al. 1999: 288-297).25 The onymic article appears to
be better established in German and Dutch, where it is obligatorily used with
many non-prototypical name types, i.e. names of buildings and institutions or
streets. What is more, regardless of name type, the onymic article is
obligatory with adjectival premodifiers. German stands out by accepting onymic
articles with personal names in informal speech in large parts of the German-speaking
area. See Figure 9 for an overview.


-----------------------------------------------------------------> prototypical

        | buildings, institutions | streets, rivers, mountains | countries, continents | cities | persons
German  | die Paulskirche, die Goetheschule | die Goethestraße, der Rhein, der K2 | Italien, Asien; das ferne Asien | Mainz, Paris; das alte Rom | Anna, Jan, coll.: die Anna, der Jan; der große Jan
Dutch   | de St. Nicolaasbasiliek, de Tower Bridge | de Hoofdstraat, de Mount Everest, de Rijn, de Moezel | Italië, Azië; het zonnige Frankrijk | Amsterdam, Parijs; het mooie Parijs | Jan, Emma (Flemish: de Jan); de kleine Jan
English | Tower Bridge, Westminster Abbey | Madison Avenue, Mount Everest; the Rhine, the (River) Thames | Italy, France, Asia, Europe; modern Brazil | London, Paris; ancient Rome | Anna, James; little James, but: the unfortunate James

Fig. 9: Usage of onymic articles in German, Dutch, and English considering different name classes


24 The few exceptions concern country names in the plural (die Niederlande, die USA) or with
feminine gender (die Ukraine, die Schweiz) and names that originally belonged to other name
classes (der Libanon < range of mountains, der Kongo < river) (cf. Thieroff 2000). Nevertheless,
the onymic article is increasingly omitted in order to adapt these younger names morphosyntactically
to the German system, which requires a country name to be neuter and undetermined (Nübling
2015). Strikingly, the corresponding country names (may) also exhibit the onymic article in
English (*(the) USA, (the) Sudan, (the) Ukraine) (cf. Quirk et al. 1999: 293). As prototypical English
country names are all undetermined (England, France, Italy), the article is also likely to be omitted
in the long run.

25 For a corpus-based investigation of article usage with complex or “multi-word” names in 
English see Tse (2005). 
Further aspects that have not yet been considered in the present study but that
appear to be relevant for the languages investigated concern, on the one hand, 
obligatoriness or, more precisely, interchangeability of the definite article and the 
possessive pronoun (when referring to body parts, for example) and, on the other 
hand, formal reduction, i.e. the occurrence of clitic forms. The following section 
provides first observations in these respects. 


4.3 Obligatoriness: Possessive pronoun vs. definite article 


The use of possessive pronouns vs. definite articles completes the general pic- 
ture presented here according to which English represents an earlier stage in 
grammaticalisation. In combination with body parts, the possessive pronoun is 
preferred in modern English (He closed his eyes/?the eyes), which parallels the 
situation in Old High German where possessive pronouns likewise constitute 
the first choice. In Middle High German, both vary freely, whereas on the way to 
New High German the definite article has taken over. In modern Dutch, both 
options are still available, but the possessive pronoun is clearly preferred (Hij sluit 
zijn ogen/?de ogen). Thus, diachronic variation in German seems to be paralleled 
by synchronic variation in the three languages considered, cf. (20)-(21) (bold 
print is used for the preferred variant): 


(20) Diachronic variation:
OHG: Tatian
leimon teta hér mir ubar minu ougun
balm gave (“did”) he to me on my eyes
MHG: Herbort von Fritzlar: Liet von Troye (a), Konrad v. Würzburg: Trojanerkrieg (b)
a. sine ougen er vf hup
his eyes he raised up
b. sie reip diu ougen unde sprach
she rubbed her eyes and spoke
NHG
er schloss seine / die Augen
he closed his / the ‘his’ eyes


(21) Synchronic variation
EN He closed his eyes/*the eyes.
DU Hij sluit zijn ogen/?de ogen.
GE Er schloss seine Augen/die Augen.
This first impression fits well into the general picture of article grammaticalisation;
however, corpus data need to be provided in future research.


4.4 Formal reduction 


Formal reduction is known as a typical side effect of increasing semantic bleach- 
ing during the process of grammaticalisation, a notion often referred to as ‘form 
follows function’. Accordingly, clitic or suffixed definiteness markers are expected 
in languages representing later stages of grammaticalisation (cf. Lehmann 1995: 
59; van Gelderen 2007 on the definiteness cycle in Germanic). Coexisting with 
more recent free forms, suffixed articles represent a characteristic feature of 
Scandinavian languages (e.g. Swedish ett hus ‘a house’ vs. hus-et house-def. ‘the 
house’) (Dahl 2004; Askedal 2011). Among the three West Germanic languages 
considered, clitic articles are most prominent in German, where masculine and
neuter definite articles cliticise to the preposition (zum Arzt ‘to=the-MASC doctor’,
durch’s Fenster ‘through=the-NEUT window’). With highly frequent, monosyllabic
prepositions (e.g. in ‘in’, zu ‘to’, an ‘on’), enclitic articles are obligatory and no
longer interchangeable with their corresponding free forms without a change in
meaning (Nübling 1992, 1998), cf. (22)-(24):


(22) Sie ist im Kino. 
‘She is in=the cinema.’ 
Sie ist in dém Kino, das ich so gerne mag. 
‘She is in the cinema that I like so much.’ 


(23) Sie ist im Urlaub / Sie ist in *dem Urlaub. 
She is in=the vacation / She is in *the vacation. 
‘She is on vacation.’ 


(24) Sie macht eine Ausbildung zur (> zu einer/*der) Krankenschwester
She makes a training to=the (> to.a/*the) nurse
‘She is trained as a nurse’


Whereas in (22) the clitic article may denote either a definite or an indefinite 
entity, in (23)-(24) only an indefinite/generic reading is possible, and the clitic is 
no longer interchangeable with the free form. Clitic articles are already attested for 
Old and Middle High German (cf. Waldenberger 2009), but are most extensively 
used in Early Modern German (cf. Christiansen 2016). In Dutch, clitics are characteristic
of Middle Dutch, where both proclitic and enclitic articles, attached to the
noun (dat/het > d, t: tkint the=child, dwater the=water) and attached to prepositions
(int < in dat/het ‘in the-NEUT’), respectively, are attested (cf. van Loey 1970:
145f.; van der Horst 2008, I: 388f.).26 In Modern Dutch, enclitics (preposition=article)
survive in informal speech, giving rise to head-marked prepositions similar
to German (e.g. Ik zag haar in’t museum; cf. van Gelderen 2007). In English, the
invariant definite article is not the object of formal reduction; clitic forms are not 
attested in modern (standard) English. In earlier stages, however, proclitics do 
appear, especially in Early Modern English texts and, archaically, also later where 
the is regularly reduced to th in front of vowels and h (th’ enemy, th’ hilt) and some 
consonants (th’ world, th’ miller) (cf. Poutsma 1914: 513f.; van Gelderen 2007). With 
enclitics being attested in Dutch and particularly in German but not in English, the 
observable formal reduction reflects, at first glance, the extent of functional 
expansion. In contrast to their Scandinavian counterparts, however, these clitics are
restricted in two ways, at least in standard varieties: syntactically, to prepositional phrases
(PP), and morphologically, to masculine/neuter (German) and neuter (Dutch)
gender, respectively; see Figure 10:


free morpheme          >          clitic article          >          suffixed article
English (NP, PP)
Dutch (NP) ------------ Dutch (PP)
German (NP) ------------------- German (PP)
                                                                     e.g. Swedish (NP, PP)


Fig. 10: Formal reduction of definite articles 


5 Summary 


The investigation of definite articles in three West Germanic languages has revealed
that grammaticalisation and functional expansion, although they occurred independently
in each of them, follow the same underlying hierarchy, presented in Figure 11:


26 The fact that definite articles occurred regularly as clitics in Middle Dutch explains the modern 
form of the neuter article het (instead of dat): Clitic t’ has erroneously been associated with the 
neuter pronoun het and, as a consequence, former dat was replaced by het on the way to modern 
Dutch (van Loey 1970: 145f.). 
DEFINITE                       >  GENERIC               >  ONYMIC
animate > concrete > abstract     singular > plural        + premodifier > - premodifier
                                  kinds > prototype(s)     - prototypical > + prototypical

---------------------------- functional expansion ---------------------------->


Fig. 11: Functional expansion of definite articles in Germanic 


In all three languages investigated, the definite article may denote individual
referents that are part of the shared pragmatic set (e.g. general knowledge, a
prior conversation) and as such identifiable by the hearer; in this use, the articles
are markers of definiteness in a strict sense (Stage I articles). Generic reference, the next step in
Lyons’ (1999) “hierarchy” of diachronic expansion, constitutes a first obstacle
in that, in all three languages, the generic article is well established only in the singular
when referring to kinds (the tiger/de tijger/der Tiger) but restricted in use with
mass nouns, which usually remain undetermined in English and Dutch (*the
rice/??de rijst vs. der Reis), see Figure 12.


Stages: I DEFINITE (animate > concrete > abstract) > II GENERIC (singular > plural) > III ONYMIC (+premodified > -premodified; -prototypical > +prototypical)

Stage                    | English                                                      | Dutch                                                           | German
I DEFINITE               | the woman, the book, the freedom                             | de vrouw, het boek, de vrijdom                                  | die Frau, das Buch, die Freiheit
II GENERIC, singular     | the tiger, *the rice, in *the spring                         | de tijger, de rijst, in de lente                                | der Tiger, der Reis, im Frühling
II GENERIC, plural       | ?the citizens, ??the Americans, *the books                   | de burgers, de Amerikanen, *de boeken                           | die Bürger, die Amerikaner, *die Bücher
III ONYMIC, +premodified | *the ancient Rome, *the little Mary                          | het oude Rome, de kleine Jan, [Zwarte Piet]                     | das alte Rom, der kleine Jan
III ONYMIC, -premodified | the Rhine, *the Mount Everest, *the Baker Street, *the Mary  | de Rijn, de Mount Everest, de Bakkerstraat, ??de Jan, *de Emma  | der Rhein, der Mount Everest, der Jan (coll.)


Fig. 12: Stages in grammaticalisation of definite articles in English, Dutch, and German 
Definite generics appear to be even more restricted in the plural, for two main
reasons. First, definite plural generics are in principle only available for animate
referents (die Bürger/die Amerikaner/*die Bücher, de burgers/?de Amerikanen/*de 
boeken). Second, being more strongly associated with kind-reference, definite 
(plural) generics (most common with nationality terms) tend to involve a negative 
reading and the notion of “othering” — a connotation which is most prevalent in 
English (the citizens/??the Americans), with bare plurals representing the unmarked 
option (??the African-Americans/African-Americans). 

Onymic articles, i.e. definite articles combined with inherently definite
proper names, constitute the third and so far final stage of extension in the languages
considered. They first become established in combination with premodifiers for
syntactic reasons. Syntactically conditioned onymic articles are already common
in Middle High German and in Middle Dutch, but until now absent in English
(das alte Rom/het oude Rome/*the ancient Rome). For the spread of the
onymic article to unmodified names, the presence/absence of appellative heads 
turned out to be crucial in that rather descriptive names and names with an 
appellative basic level term (e.g. names of streets and institutions) take the 
onymic article more readily than prototypical names (e.g. names of persons and 
cities) (e.g. DU de Hoofdstraat vs. *de Emma, GE die Hauptstraße vs. ??die Emma).
Apart from that, name class has been identified as a decisive factor (EN *the 
Queen Mary (person) vs. the Queen Mary (ship); DU *de Koningin Mary vs. de 
Queen Mary). In English, proper names remain undetermined in most cases, 
which also holds true for many non-prototypical names with appellative heads 
(e.g. *the Baker Street, *the Mount Everest - but the [River] Thames, the Atlantic
[Ocean]). Strikingly, the limited use of onymic articles in English as compared to 
Dutch and German parallels the situation with semantically definite uniques, 
where the expletive definite article likewise occurs less consistently (*the heaven, 
*the paradise, (the) earth) and, accordingly, is best described as a definiteness 
marker in a strict sense. In German, at the other end of the continuum, the functional
expansion is most advanced, as, in large parts of the German-speaking area (Central and
South German), the onymic article also combines with personal names in informal speech.
As has been shown, personal names preceded by an onymic article are not an
unknown feature in Dutch, either, but limited to southern dialects and to names
of men.
A diachronic contrastive study of
sentence-internal capitalisation in Dutch 
and German 


Abstract: The present contribution analyses the sentence-internal capitalisation 
practice in selected Dutch bibles printed between 1450 and 1750. The use of majus- 
cules proves to be highly sensitive to word class, i.e. it almost exclusively affects 
nouns. The Dutch case exhibits clear parallels to the emergence and development 
of sentence-internal capitalisation in German: In both languages, the majuscule 
was first conventionalised in proper names. Within common nouns, the use of 
uppercase letters is initially driven by pragmatic factors (i.e. emphatic and/or 
honorific use). By the end of the 16th century, however, the use of majuscules is 
increasingly motivated by cognitive factors, mainly animacy and concreteness of 
the referent. Finally, the comparison of Dutch bible prints with their German textual
basis shows that Dutch printers did not adopt the capitalisation conventions
of the German source-text on a one-to-one basis. Rather, Dutch printers appear to
have temporarily established a capitalisation practice of their own with a clear
preference for capitalising concrete nouns as opposed to abstract nouns. However,
the capitalisation practice is generally characterised by a tremendous inconsistency
across the individual Dutch bible prints throughout the whole period under consideration.
This inconsistency is considered to be one reason for the fact that sentence-internal
capitalisation was abandoned in Dutch spelling in the long run.


Zusammenfassung: Der vorliegende Beitrag untersucht die satzinterne Majuskel- 
praxis in ausgewählten niederländischen Bibeldrucken zwischen 1450 und 1750. 
Der Majuskelgebrauch erweist sich als extrem wortartsensitiv, d.h. er betrifft fast 
ausschließlich Substantive. Dabei tun sich beim Vergleich mit der Entwicklung der 
satzinternen Großschreibung im Deutschen deutliche Parallelen auf: So konsoli- 
diert sich die Majuskel auch im Niederländischen zuerst bei Eigennamen. Inner- 
halb der Appellative dominiert ebenfalls zunächst eine pragmatisch gesteuerte 
Großschreibungspraxis (d.h. als Hervorhebungs- und/oder Ehrerbietungssignal), 
die im ausgehenden 16. Jahrhundert vom semantisch-kognitiven Faktor der 
Belebtheit bzw. Konkretheit des Referenten abgelöst wird. Der Vergleich zwischen 
niederländischen Bibelausgaben und ihren deutschsprachigen Übersetzungs- 
vorlagen zeigt schließlich, dass der Majuskelgebrauch nicht eins zu eins aus dem 
Deutschen übernommen wurde. Vielmehr etabliert sich im Niederländischen die 
Tendenz zu einer beinahe exklusiven Großschreibung von Konkreta gegenüber 
kleingeschriebenen Abstrakta. Allerdings divergiert der Majuskelgebrauch z.T.
immens zwischen den einzelnen Bibelübersetzungen. Diese über den gesamten 
Untersuchungszeitraum dokumentierte Inkonsistenz dürfte mitunter für die Rück- 
nahme der satzinternen Majuskel im Niederländischen verantwortlich sein. 


1 Introduction 


Sentence-internal capitalisation is probably one of the most outstanding hallmarks 
of modern German orthography: Each noun or nominalisation functioning as the 
head of the noun phrase is consistently marked with a majuscule, an orthographic 
rule which goes far beyond the capitalisation practices found across other standard 
language orthographies based on the Latin alphabet. Here, the use of sentence- 
internal majuscules is mainly restricted to a subclass of the noun category: the 
proper names. However, Maas (1995: 90) asserts that the capitalisation of nouns 
in sentence-internal position was once a pan-European phenomenon of Early 
Modern times. Though his claim remains rather tentative (he does not provide any 
empirical evidence), it is at least partially supported for the spelling systems of 
northwestern European languages either by anecdotal evidence in the literature or 
the few studies on the emergence and development of sentence-internal capitali- 
sation in French (cf. Husson 1977; Meisenburg 1990) and English (see Osselton 
1984, 1985; Schnaar 1907). Moreover, Bunčić (2012: 242) points out that majuscules 
were more widely spread in Polish in the 16th century. The use of uppercase letters 
in common nouns is attested for French until the 17th/18th century, for Swedish 
(cf. Maas 2007) and English until the 18th century, and for Icelandic until the 19th 
century (cf. ibid.). In Norwegian (cf. Lundeby/Torvik 1956) and Danish (cf. Ham- 
burger 1981) we even find a sentence-internal capitalisation practice comparable 
to that of modern German orthography. In both languages, however, sentence- 
internal capital letters for common nouns were abandoned in the course of 
spelling reforms: in Norwegian in 1907, in Danish in 1948.1 

1 I would like to thank two anonymous reviewers and Gunther De Vogelaer for helpful comments
on a previous draft of this paper.

Maas (1995, 2007) assumes that the practice of capitalising words in sentence-internal
position originated in Germany, where it is increasingly attested in
book prints from the 16th century onwards and was then conventionalised during the 17th
and 18th centuries (see section 2.1). According to Maas, this innovation diffused
more broadly through cultural contact, i.e. the printing industry, by German book
printers who spread their craft knowledge and capitalisation habits throughout
Europe. The European book trade, as well as book printing, was indeed dominated
by Germans in the first decades after the printing revolution in Germany in the
middle of the 15th century (cf. Hruschka 2012: 38). Hence, most early European
printers were either Germans or had been trained in Germany. Though it seems 
reasonable to consider the pan-European sentence-internal capitalisation practice 
of Early Modern times as a type of contact-induced phenomenon, Maas’ claim has 
not yet been verified empirically. This is precisely where the present contribution 
steps in: It aims to test Maas’ hypothesis for Dutch, which is assumed to have at 
least temporarily exhibited a sentence-internal capitalisation tendency (cf. Maas 
2007: 398). For this purpose, a corpus of 26 Dutch bible prints covering the period 
between 1450 and 1750 will serve as a testing ground (for details see section 3.1). 
Bible prints are particularly suitable for this purpose because identical passages 
can synoptically be compared to each other and to their German counterparts. 
This allows us to reliably detect convergences and discrepancies in the capitalisation
practice between Dutch and German vernacular translations that are not
attributable to the text type as an intervening variable but rather to a possible influence
of the German source which served as textual basis for the Dutch translation
(for instance, the Biestkensbijbel 1560 and the Lutherse Vertaling 1648 which both 
relied upon a German translation of Luther’s Bible). Thus, if the German capitali- 
sation practice did exert an influence on Dutch spelling, it is most likely to be 
detected in those bible prints that used a German source (see section 3.4). The lat- 
ter may apply even more in the light of a traditionalist writing or printing practice 
often found across bible reprints and later editions. As far as the German Bible 
tradition of Early Modern times is concerned, Sonderegger (1998: 230-233) points 
out that the editions relying on Luther avoided substantial modifications to the 
original text for centuries. In this vein, Dutch bibles based on a German source, in
particular on Luther’s Bible, may also exhibit an above-average level of
sentence-internal capitalisation as a consequence of indebtedness to the original. This likely
conservative trait of bible prints leading to a specific and individual bible printing 
tradition (cf. Bergmann/Nerius 1998: 76f.) suggests, however, that one must be very 
cautious about generalisations concerning the sentence-internal capitalisation 
practice in Dutch and the language-internal and language-external factors contributing to it.

Hence, the present study must be considered as a first contribution to the dia- 
chrony of sentence-internal capitalisation practice in Dutch and as an invitation 
for further research in this field. Whereas the use of sentence-internal majuscules 
has been neglected so far in the literature, the analysis presented here proves that 
word-initial capital letters are indeed well-documented in the bible prints (see 
section 3): As will be shown, sentence-internal majuscules in Dutch bibles 
spread along similar paths as in German and English spelling, starting out as 
pragmatically driven reverence markers and signals of emphasis, which were
then conventionalised in proper nouns (and some sacred nouns). Finally, the use
of uppercase letters was temporarily extended to common nouns in general, albeit
with a clear preference for concrete nouns over abstract ones.


2 Sentence-internal capitalisation in 
German and English 


In what follows, the emergence and development of sentence-internal capitali- 
sation will be outlined for German, which has been most extensively examined 
so far (e.g. Bergmann/Nerius 1998), and then compared to that of English, for 
which we can rely upon Osselton’s (1984, 1985) analysis of some fifty prose texts 
first edited by London printers. The focus of this diachronic sketch (section 2.1), 
which will serve as the backdrop and reference point for the analysis of Dutch 
bible prints in section 3, lies on the identification of common factors contributing 
to the use of word-initial majuscules as well as of common paths for their spread
in printed texts, on which the hypotheses for Dutch capitalisation practice will be 
elaborated (section 2.2). 


2.1 Diachronic overview 


It is generally agreed that sentence-internal capitalisation originates in a
pragmatically driven usage of majuscules. That is, capital initials served as a signal of
emphasis, highlighting words of special importance to the text irrespective of their word
class and syntactic use. As far as German is concerned, this capitalisation practice 
prevailed until the beginning of the Early New High German period, as has been 
shown by Bergmann/Nerius (1998) in their comprehensive study on the diachrony 
of sentence-internal capitalisation in printed texts of different genres from 1500 to 
1710 (see also Weber 1958; Kaempfert 1980; Labs-Ehlert 1993). This pragmatically 
driven usage corresponds to Osselton’s (1985: 54f.) “word-prominence” principle 
documented for English between 1500 and 1800 (see also Schnaar 1907: 92-98). 
Even though the prominence of a word is not tightly coupled to a specific word 
class, it is the noun category that is mainly affected by capitalisation in both lan- 
guages (cf. Bergmann/Nerius 1998; Osselton 1984, 1985; Schnaar 1907). The overall 
tendency to uppercase nouns more frequently compared to other word classes 
results to a large extent from sociopragmatic factors, i.e. capital letters are a means 
of expressing reverence and respect either to deities, mainly to God and other
contiguous theological concepts (i.e. nomina sacra, e.g. ‘church’, ‘prophet’), or to
individuals of high social status (e.g. official titles). Remnants of such an honorific
capitalisation usage are still found in various orthographic systems of the world’s
modern languages; compare, for instance, the convention of capitalising ‘God’ and
the pronouns referring to him in religious writings in English (<He, Him, His>) as
well as the capitalisation of formal address pronouns in, amongst others, German (<Sie,
Ihnen> etc.), Italian (<Lei, Loro> etc.) or Polish (<Ty> ‘thou’) (cf. Back 1995: 57f.).

Besides this emphatic or honorific capitalisation practice, the diachronic 
data reveal a cognitively motivated use of uppercase initials, mainly involving the 
categories individuality and animacy. The first relates to the conceptualisation of 
referents/entities in terms of saliency for the language user: Proper nouns as rigid 
designators with an identifying function (monoreferentiality) and individualising 
properties (they are inherently definite) are thus located at the top of the
individuality scale compared to common nouns. Animacy, which is strongly interrelated
with individuality, is based on the life concept, spanning from human entities
through animate to inanimate objects (cf. Yamamoto 2008: 1). Individuality
explains the fact that proper nouns were uppercased earlier and more consistently
than their common noun counterparts, a tendency that is documented for
French (cf. Meisenburg 1990) and which has also prevailed in many other modern
orthographies (cf. Back 1978, 1991, 1995). In German, the capitalisation of proper
nouns was conventionalised by the first half of the 16th century (cf. Bergmann/ 
Nerius 1998), in English probably during the 16th century (cf. Osselton 1985: 53, 
Schnaar 1907: 94). In the case of German, the interaction between individuality and 
animacy becomes evident in the fact that the capitalisation of person names was 
conventionalised earlier than in other proper noun classes (e.g. toponyms). The 
higher an entity is located on the animacy scale, the more it is perceived as
individuated in human cognition, and the more likely it is to be highlighted by a
word-initial majuscule (cf. Szczepaniak 2011; Barteld/Hartmann/Szczepaniak 2016).

Animacy also played a key role in the further spread of sentence-internal 
capital letters in common nouns. This is best-documented for German, where 
person designations feature majuscules earlier and more frequently than nouns 
denoting non-human animate entities (e.g. animals or plants), which in turn are 
capitalised more consistently than concrete but inanimate ones. The last subclass 
to exhibit capital initials with a certain regularity is that of abstract nouns (in 
German during the 18th century). Osselton (1985: 56) observes a similar, semanti- 
cally driven capitalisation practice for English in the 17th and 18th centuries, with 
some twenty animate nouns being consistently uppercased as opposed to other
concrete but inanimate ones. This suggests that language-independent cognitive 
principles may have also played a key role in the capitalisation tendencies 
detected in Dutch bible prints (see section 3.3.2). 

In addition to the influence of cognitive principles on the sentence-internal
capitalisation practice, further factors have been argued to have an impact
on the use of uppercase letters, most notably syntactic ones. Maas (1995, 2007)
assumes that initial majuscules were introduced to mark syntactic boundaries, in
the case of German to signal the right boundary of a noun phrase as a supporting
parsing strategy for the reader (for English see Grüter 2009). Dücker (forthc.) tests
Maas’ hypothesis and disproves it empirically, showing that majuscules are
mainly used to highlight the inner structure of complex genitive phrases: Here,
however, it is the noun of the genitive attribute that is uppercased, whereas the 
head noun is lowercased. Recent studies on the emergence and development of 
sentence-internal capitalisation in German strongly suggest a multifactorial 
approach to satisfactorily account for the numerous possible influencing and 
interacting factors, including agency, the letter’s shape, idiolectal parameters etc. 
(cf. Barteld/Hartmann/Szczepaniak 2016; Dücker/Hartmann/Szczepaniak forthc.). 

As Osselton (1984: 127, 1985: 50) shows, English once came very close to the
capitalisation practice of modern German, highlighting almost 100% of the com- 
mon nouns in a running text during the 18th century, a capitalisation rate that 
corresponds to the state of development found in German book prints around 
1700 (cf. Bergmann/Nerius 1998; Bergmann 1999: 66). The graphs in Figure 1 
illustrate the sentence-internal capitalisation patterns for common nouns in 
English and German for the period between 1550 and 1800 based on Osselton’s 
(1984: 127) and Bergmann/Nerius’ (1998) data. 


[Figure 1: line graph; y-axis: capitalisation rate in % (50, 100); x-axis: year (1600, 1700, 1800); series: German, English]

Fig. 1: Capitalisation patterns for common nouns in English and German


Two things are striking when comparing the patterns of English and German in 
Figure 1. First, German clearly precedes English with respect to the
conventionalisation of uppercased common nouns: Around 1600, more than 70% were
capitalised in German, whereas in English, this capitalisation rate is reached a
hundred years later. Whether and to what extent this time-lag can be directly attributed
to the influence of German capitalisation practice gradually diffusing throughout
Europe as suggested by Maas (1995, 2007) remains an open question.
Nevertheless, the trend to highlight common nouns with a majuscule increases
drastically in both languages within only about a century. However, and that leads
us to the second striking observation, the temporarily convergent developments 
rapidly drift apart by the turn of the 19th century: Whereas the use of sentence- 
internal majuscules is conventionalised in written German during the 18th century 
(and even extended to nominalisations, cf. Bergmann/Nerius 1998), English is 
characterised by a “sudden drop after 1750” (Osselton 1984: 127), thus falling back 
to the capitalisation practice which had been in use before 1550, see Figure 1. 

Various arguments have been put forward in the literature to explain this 
abrupt and radical change. Maas (2007: 398) assumes that the introduction of 
minuscules for common nouns was politically and/or religiously motivated, i.e. it 
was a counter-reaction of the reformation and counter-reformation movements in 
European countries breaking away from a spelling convention strongly loaded with 
confessional, mainly Lutheran connotations. In this spirit, Luther’s works were 
censored in several, mainly Catholic European countries such as France and Spain, 
but also in England during the Anglican Revolution in the 16th century (cf. Müller- 
Oberhäuser 2011a, b). However, Osselton’s data clearly show that sentence-internal 
majuscules were introduced in common nouns after the Anglican Revolution (see 
Figure 1), proving Maas’ explanation - at least for the English case - untenable. 

Osselton (1984: 128), in turn, attributes this change to the fact that sentence- 
internal capitals lost their emphatic potential to highlight words carrying a special 
emphasis or a greater semantic weight once they had developed into “a marker of 
word-class”. As a consequence of this functional change, they became “generally 
redundant [...] and could be dropped without inconvenience” (Osselton 1984: 
128). The general problem with this argument is that majuscules were and are 
still used with proper nouns in English, i.e. they serve at least as an exclusive 
marker of a nominal subclass, a convention that also diminishes their emphatic 
potential. The abandonment of capitalised nouns was perhaps less a matter of an
increasing functionalisation as lexical class markers than the logical
consequence of an overextensive use of sentence-internal majuscules attested
throughout the 18th century, where (almost) every word of a sentence was 
uppercased, a spelling convention that comes close to the capitalisation prac- 
tice found in English book titles (cf. Osselton 1985: 51f.). In this case, it seems 
reasonable that a capitalisation practice running riot contributed to an entire 
loss of any useful function sentence-internal majuscules “might previously 
have had” (Osselton 1985: 59). 


2.2 Interim conclusion and hypotheses


The diachronic data thus strongly suggest the following common paths with 
respect to the use and spread of sentence-internal majuscules: The nucleus of 
capitalisation practice lies in a pragmatically driven usage, which then expands 
along the lines of a cognitive-semantic continuum ranging from human via non- 
human animate and inanimate concrete towards abstract entities. Within this 
gradual process, proper nouns and nomina sacra precede common nouns. Once 
all nouns have been captured by capitalisation, sentence-internal uppercase
letters function as markers of a lexical class. This lexical principle, which
prevailed in English spelling throughout the 18th century but was then given up,
has been superseded in German orthography by a syntactically motivated
capitalisation principle: each head of a noun phrase requires a majuscule. Against
this backdrop, the following hypotheses can be derived for the sentence-internal 
capitalisation practice, if there is any, in Dutch bible prints: 


1. It is assumed that majuscules will be highly sensitive to the word-class, with
nouns being generally more prone to feature initial capitals than any other 
word category (section 3.1). 

2. It is expected that the use of majuscules will be first conventionalised in 
proper nouns and then in words of theological importance (i.e. nomina 
sacra), which in turn are expected to precede all other common nouns with 
respect to capitalisation (section 3.2). 

3. The earliest attestations of word-initial majuscules in common nouns are
expected to be pragmatically driven, i.e. they highlight words of specific sig- 
nificance to the (con)text (section 3.3.1). 

4. Within common nouns, those denoting concrete entities as opposed to those 
referring to abstract concepts will be more prone to feature an initial capital. 
Within concrete nouns, the use and spread of capital letters is expected to 
reflect the animacy scale (human > animate > non-animate entities; section 
3.3.2). 
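
Hypothesis 4 amounts to an ordering claim that can be checked mechanically once capitalisation rates per noun class are available. The following Python sketch only illustrates such a check and is not part of the study's method: the function name `follows_animacy_scale`, the class labels, and the rates in the usage example are hypothetical placeholders, not corpus results.

```python
# Classes ordered from most to least likely to be capitalised under
# hypothesis 4 (labels are illustrative, not taken from the study's coding).
ANIMACY_SCALE = ["human", "animate", "inanimate", "abstract"]


def follows_animacy_scale(rates, scale=ANIMACY_SCALE):
    """Return True if the rates never increase from one class to the next.

    `rates` maps a class label to its capitalisation rate (0.0-1.0);
    classes missing from `rates` are skipped.
    """
    observed = [rates[label] for label in scale if label in rates]
    return all(higher >= lower for higher, lower in zip(observed, observed[1:]))


# Invented placeholder values for illustration only (not corpus results).
example_rates = {"human": 0.9, "animate": 0.6, "inanimate": 0.5, "abstract": 0.2}
print(follows_animacy_scale(example_rates))
```

A single edition whose rates violate the ordering would not by itself refute the hypothesis; the check is meant to be applied per edition and read alongside the token counts behind each rate.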


The influence of syntactic factors (e.g. the complexity of the NP) is not
pursued here and remains a task for future studies. Finally, the question whether
and to what extent the use of sentence-internal majuscules in Dutch bibles is 
attributable to an influence of the capitalisation practice in German bible editions 
that served as textual bases will be addressed in section 3.4. 


3 Sentence-internal capitalisation
in Dutch bible prints 


This section is concerned with an in-depth analysis of the sentence-internal capi- 
talisation practice in Dutch bibles across four centuries (sections 3.2-3.4). Before 
presenting the results, the data and the methodology will be outlined (section 3.1). 


3.1 Corpus and Methodology 


For the present study, a corpus of 26 Dutch bible editions spanning from the 15th to
the 18th centuries was compiled relying upon the Biblia Sacra database²; compare
Table 1. Thirteen prints are first editions issued between 1477 and 1648, which will
serve as the starting point of the analysis in sections 3.2-3.3. One part of the bible 
corpus comprises Dutch translations either exclusively or partially based on a Ger- 
man source text (e.g. Luther’s Bible) as in the case of the Lutherse Vertaling 1648, 
the other part includes vernacular versions of the bible directly translated from the 
original sources, i.e. the Latin Vulgate (e.g. Delftse Bijbel 1477) or the Greek and 
Hebrew text (e.g. Statenbijbel 1637; for details see section 3.4). In the case of the 
historically most significant translations into Dutch, i.e. the Biestkensbijbel (1560), 
the Deux-Aesbijbel (1562), the Statenbijbel (1637), and the Lutherse Vertaling 
(1648), the corpus was extended by three to four subsequent editions, see Table 1: 
This allows us to capture possible modifications in the capitalisation practice 
within these bible versions (see section 3.3.2). 

The analysis of sentence-internal capitalisation was conducted, by way of example,
on Genesis Chapter 1 (First Book of Moses, Old Testament), which comprises about 750
graphematic words in the sense of Fuhrhop (2008), i.e. written words between
two spaces (e.g. <_Licht_> ‘light’) or between a whitespace and a punctuation
mark (e.g. <_Wateren.> ‘waters’ or <_Dach/> ‘day’). The total number of tokens may
diverge slightly between the individual bible editions due to the translation practice itself
(e.g. <wilde dieren> ‘wild animals’ vs. <beesten> ‘beasts’) or due to the number of
graphic contractions in the text, compare, e.g. ‘in the’: <INDen> (Delftse Bijbel
1477), <IN den> (Deux-Aesbijbel 1562), <IN’t> (Lutherse Vertaling 1648). Each 
graphematic word was then assigned to its lexical/grammatical word class using 
STTS tags. Of the 750 tokens, c. 200 are nouns, c. 130 verbs, c. 45 adjectives includ- 
ing a few numerals, c. 15 adverbs, while the remaining 360 tokens account for 


2 For details concerning the Biblia Sacra database see Aalderink/Verbraak (2006). 
functional, i.e. grammatical word classes such as articles, prepositions etc. Again,
smaller deviations in the distribution over word classes between the individual bible
editions can be attributed to the translation practice: for instance, some editions
tend to pronominalize introduced referents more often, which directly affects the
total number of nouns and pronouns in the text. Moreover, the data were coded
for animacy using the classification of the DFG project “Development of
Sentence-internal Capitalization in German”³, which assigns nouns to the following
classes (cf. Barteld/Hartmann/Szczepaniak 2016): super-human (e.g. god),
human (e.g. mensche ‘mankind’, wijf ‘woman’), animate (e.g. dieren ‘animals’, 
voeghel ‘bird’), inanimate but concrete (e.g. aerde ‘earth’, zee ‘sea’), and abstract 
(e.g. beghin ‘beginning’, onderscheit ‘difference’). 
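
The token definition and the per-class counts described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: sentence boundaries are assumed to fall after ".", "?" or "!", the virgule is treated as a sentence-internal separator, the STTS word-class tagging is not reproduced, and the animacy lookup table as well as all function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: only these marks end a sentence; the virgule "/" and commas,
# which are frequent in early prints, count as sentence-internal punctuation.
SENTENCE_END = {".", "?", "!"}
PUNCTUATION = ".,;:?!/"


def graphematic_words(text):
    """Split text into graphematic words (whitespace-delimited chunks).

    The virgule is padded with spaces first, because prints often set it
    without surrounding whitespace (e.g. "dlicht/den").
    """
    return re.sub(r"/", " / ", text).split()


def sentence_internal_tokens(text):
    """Yield (word, is_capitalised) for every token that is not sentence-initial."""
    sentence_initial = True
    for chunk in graphematic_words(text):
        word = chunk.strip(PUNCTUATION)
        if word and not sentence_initial:
            yield word, word[0].isupper()
        if word:
            sentence_initial = False
        if chunk[-1] in SENTENCE_END:
            sentence_initial = True


def capitalisation_rates(text, animacy_of):
    """Share of capitalised sentence-internal tokens per animacy class.

    `animacy_of` is a hypothetical lookup from lowercased word form to class;
    tokens without an entry (non-nouns, uncoded forms) are ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # class -> [capitalised, total]
    for word, is_capitalised in sentence_internal_tokens(text):
        animacy_class = animacy_of.get(word.lower())
        if animacy_class is None:
            continue
        counts[animacy_class][0] += int(is_capitalised)
        counts[animacy_class][1] += 1
    return {cls: cap / total for cls, (cap, total) in counts.items()}


# Usage with Gen. 1:5 as printed in the Vorstermanbijbel (1528).
verse = "En naemde dlicht/den Dach/ende die dusternissen/den Nacht/"
lookup = {"dlicht": "inanimate", "dach": "abstract", "nacht": "abstract"}
print(capitalisation_rates(verse, lookup))
```

Keying the lookup on spelled forms rather than lemmata keeps the sketch short; a lemmatised corpus would map each token to its lemma before looking up the animacy class.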


Table 1: Dutch bible corpus 


15th century
- 1477 Delftse B.

16th century
- 1513 Bijbel int corte
- 1526 Liesveltb.
- 1528 Vorstermanb.
- 1548 Blanckartb.
- 1548 Leuvense B.
- 1558 Emden Bijbel
- 1560 Biestkensb. (reissues 1582, 1646, 1702, 1750)
- 1562 Deux-Aesb. (reissues 1579, 1597, 1633)
- 1599 Moerentorfb.

17th century
- 1637 Statenbijbel (reissues 1670, 1708, 1747)
- 1648 Lutherse V. (reissues 1671, 1701, 1748)

18th century
- 1711 Biblia Pentapla

The list under (1) comprises all nouns attested in Genesis Chapter 1 and arranges 
them according to the animacy scale, using Modern Dutch spelling for the sake of 
reader-friendliness. The distinction between concrete and abstract common 
nouns is based on Ewald (1992), who defines concrete nouns as denoting phenomena
that are perceivable by the senses in the broadest sense. In prototypical concrete
nouns, several sensory perceptions are at work: for instance, a cat can be visually
and acoustically perceived (but it can also be touched), while in the case of food the


3 The original project title: “Entwicklung der satzinternen Großschreibung im Deutschen” (cf. 
Barteld/Hartmann/Szczepaniak 2016). 
gustatory and olfactory systems are involved. In contrast, in the case of
peripheral concrete nouns, perception is restricted to one sense, compare light (vision)
or noise (audition). Though the lemma geest ‘spirit’ generally represents a prime 
example of an abstract noun in the sense of Ewald (1992: 279), since it cannot be 
perceived by any sense, it has been classified under the sub-category
“super-human” because it is exclusively used to refer to the ‘spirit of God’ (e.g. gheest
Godts) in Genesis 1.


(1) Nouns attested in Genesis 1 


CONCRETE

super-human: god ‘god’, geest ‘spirit’

human: man ‘man’, mens ‘human; mankind’, vrouw ‘woman’, wijf ‘woman’

animate: beest ‘beast’, boom ‘tree’, boomvrucht ‘tree fruit’, dier ‘animal’,
gedierte ‘animals’, gevogelte ‘birds’, gewemel ‘swarm’, gewormte ‘worms’,
gras ‘grass’, kruid ‘herb’, vee ‘livestock’, vis ‘fish’, vogel ‘bird’,
vrucht ‘fruit’, walvis ‘whale’

non-animate: aarde ‘earth’, aardbodem ‘ground’, afgrond ‘abyss’, diepte ‘abyss’,
droge ‘the dry’, duisternis ‘darkness’, firmament ‘void’, hemel ‘heaven, sky’,
hout ‘wood’, licht ‘light’, meer ‘sea’, plaat ‘(earth) plate’, plaats ‘place’,
spijs ‘food’, ster ‘star’, water ‘water’, zaad, (be)zaadsel ‘seed’, zee ‘sea’

ABSTRACT

aangezicht ‘face’⁴, aard ‘kind’, avond ‘evening’, beeld ‘image’, begin ‘beginning’,
dag ‘day’, gelijkenis ‘image’, geslacht ‘kin’, heerschappij ‘reign’, jaar ‘year’,
leven ‘life’, midden ‘the middle’, maan ‘month’, morgen ‘morning’, nacht ‘night’,
onderscheid ‘difference’, teken ‘sign’, versammeling, vergadering ‘gathering’,
ziel ‘soul’


4 Since aangezicht appears in the context en[de] die duysternisse was op dat aengesicht des
afgrondts ‘and darkness was over the face of the deep’ (Blanckartbijbel 1548, Gen. 1:2), it has
been classified as abstract noun.



Before going into further details, it is worth noting the overall distribution of sen- 
tence-internal majuscules across word classes. Capital letters are almost exclu- 
sively used to highlight nouns, whereas other word classes only exceptionally 
exhibit majuscules, thus confirming hypothesis 1 formulated under section 2.2: 
Uppercase letters prove to be highly sensitive to the noun category. In the case of 
Genesis 1, the capitalisation of non-substantival words concerns the numerals 
used to refer to the six days of creation, e.g. <de Eerste dagh> ‘the first day’, and 
the pronouns referring to God, e.g. <Hy, Hem>. In what follows, the analysis of 
sentence-internal majuscules will be restricted to nouns. 


3.2 Proper nouns and nomina sacra 


Within the fourteen first editions, sentence-internal majuscules are not attested 
earlier than the 16th century. They are introduced for the first time in the Bibel int 
Corte (1513) to highlight the person names <Adam> and <Eua> (the remaining 
bible translations of Genesis 1 do not exhibit any proper nouns at all), while 
<god> as nomen sacrum is still lowercased. However, this changes by the second
quarter of the 16th century, when majuscules are consistently introduced as
reverence markers for <God(t)>, not only in the Liesveltbijbel (1526) but also in all
subsequently issued bible prints, see the examples from the bible editions between
1477 and 1548 in Table 2.


Table 2: Capitalisation of God in early Dutch bible prints 


1477            1513             1526            1528          1548             1548
Delftse Bijbel  Bibel int Corte  Liesveltbijbel  Vorstermanb.  Blanckartbijbel  Leuvense Bijbel

<god>           <god>            <God(t)>        <God>         <God>            <Godt>

Since god is used monoreferentially to refer to the Christian God (here, it cannot be
pluralised as opposed to contexts where the concept generally refers to individual
deities, see Kopf forthc.), it can also be classified as a proper noun. Hence, it is
hardly surprising that the initial majuscule is conventionalised here first, namely
from the second quarter of the 16th century onwards, and only later in the case of
geest ‘spirit’ in the NP <Gheest Godts, Godts Gheest>, which has been consistently
uppercased since the Emden Bijbel (1558). The capitalisation of God and Geest
has prevailed in modern Dutch bible translations. 

Though proper nouns and nomina sacra respectively seem to precede common 
nouns of non-theological importance with respect to sentence-internal capitalisa- 
tion, it is not possible to draw any general conclusions solely on the basis of two 
lemmata. Therefore, an additional analysis of Genesis Chapter 2 (comprising c. 600 
tokens, count based on the Statenbijbel 1637) was carried out, which includes sev- 
eral proper nouns such as names of persons (e.g. Adam), rivers (e.g. Euphrates, 
Gihon, Pishon, Tigris, also translated as <Hydekel>), and places (e.g. Assyria, Ethi- 
opia, also translated as <Mo(o)renlant>, Havilah), as well as the sacred nouns 
‘Lord’ (here) and ‘paradise’, the latter translated either as paradijs (e.g. Delftse 
Bijbel 1477; Liesveltbijbel 1526; Leuvense Bijbel 1548), lusthof ‘pleasure garden’ 
(e.g. Vorstermanbijbel 1528; Deux-Aesbijbel 1562) or hof ‘yard’ (e.g. Biestkensbij- 
bel 1560; Statenbijbel 1637; Lutherse Vertaling 1648), compare Table 3: Whereas 
proper nouns are still lowercased in the Delftse Bijbel (1477), they are
consistently capitalised from the 16th century onwards, compare the examples under
(a) (only exception: moorianen lant in the Leuvense Bijbel 1548). Majuscules in 
nomina sacra, in contrast, are conventionalised not earlier than the middle of 
the 16th century, compare also the examples under (b). 

Since an increasing capitalisation tendency in common nouns is not attested 
before the second half of the 16th century at the earliest, the data confirm
hypothesis (2) (see section 3.3). More interestingly, the convention to uppercase proper
nouns and nomina sacra in Dutch bibles coincides with the capitalisation practice 
attested in Early New High German prints, where initial majuscules are convention- 
alised in proper nouns around 1530 at the latest and in nomina sacra around 1560 
(cf. Bergmann/Nerius 1998; Bergmann 1999: 69), see (2) (based on Table 3). 


(2) Conventionalisation of initial majuscules in noun classes 


          proper nouns   nomina sacra   common nouns
German    c. 1530        c. 1560        c. 1590
Dutch     c. 1526        c. 1558        never fully conventionalised
                                        (first attested in the Vorstermanb. 1528)


Table 3: Capitalisation of proper nouns versus nomina sacra

Columns: 1477 Delftse Bijbel | 1526 Liesveltbijbel | 1528 Vorstermanbijbel | 1548 Leuvense Bijbel | 1558 Emden Bijbel

(a) proper nouns
phison  Phison  Phison  Phison  Pison
euilath  Heuila  Heuilah  Heuilath  Heuila
gyo[n]  Gihon  Gion  Gehon  Gihon
ethiopie[n]  ---  Moore[n]la[n]t  Moorenlant  moorianen lant  Morenlant
tygris  Tigris  Tigris  Tygris  Hydekel
Assierien  Assyrie[n]  Assyrien  Assyrien  Assyrien
Eufrates  Euphrates  Euphrates  Euphrates  Phrath
a(a)dam  Ada[m]⁵  Adam  Adam  Adam
Eua  Eua  Heua  Eua
Eden  Eden

(b) nomina sacra
God  God  God(t)  God(t)  God(t)
Here  HERE  HEERE  Heere  HEERE
Paradijs  paradijs  paradijs, Paradijs  paradijs  Paradijs
hof  Hof


3.3 Majuscules in common nouns 


As already outlined in section 2.1, initial majuscules in German common nouns 
other than nomina sacra started out as an emphatic device to highlight words of 
special importance to the text (pragmatic use), and then spread along the animacy 
scale, thus becoming conventionalised first in nouns referring to persons around 
1590, followed by animate and inanimate concrete nouns around 1620, and then 
by abstract nouns on the threshold of the 18th century, which marks the functional 
change of the initial majuscule from a semantically driven graphic device to a 


5 The Liesveltbijbel (1526) uses <Mannine> instead of Adam: Me[n] salse Mannine hoete[n] (Gen.
2:23). A similar translation is also found in the Leuvense Bijbel (1548) and the Emden Bijbel
(1558): Dese sal Manninne ghenaemt worden (Gen. 2:23).
marker of lexical class (cf. Bergmann/Nerius 1998). This development is also
reflected in Luther’s High German bible editions from 1523, 1534, and 1545, for
which Genesis Chapter 1 was additionally analysed to provide a basis of
comparison with the results of the Dutch bible prints.⁶

In the first version from 1523, sentence-internal capital letters are sparingly
used in common nouns (ca. 3% of the cases), and only to highlight the words
relating to the creation of the day and the night (<Tag, Nacht>), the creation of
the sky/heaven, the earth, and the sea (<Hymel, Erde, Meere>). This pragmatic
use of majuscules is only documented in Dutch bibles from the first half of the
16th century (see section 3.3.1). In the second edition from 1534, the capitalisation
ratio in common nouns has tripled (ca. 9%), mainly due to a more consistent use
of initial majuscules in concrete nouns such as <Feste> ‘void’, <Liechter> ‘lights’,
and <Himel> ‘sky, heaven’. In the third edition from 1545, i.e. within only about
twenty years of the first, sentence-internal capitalisation shoots up drastically to
an average of ca. 82%, with 98% of the concrete nouns being uppercased as opposed
to only 43% of the abstract nouns. This divergent behaviour of concrete versus
abstract nouns with respect to capitalisation is also attested in the Dutch bibles
under consideration (section 3.3.2). However, Dutch seldom reaches capitalisation
levels as high as those of its German counterparts.


3.3.1 Emphatic use of majuscules 


A pragmatic capitalisation practice comparable to that of Luther’s earliest version 
of the bible (1523) concerns the use of uppercase letters in words relating to the 
topic of Genesis Chapter 1: the creation or, more precisely, the individual steps of
creation, a tendency that is first documented in the Vorstermanbijbel (1528). In this
vein, the majuscule is found in the Blanckartbijbel (1548) in <Beghin> to emphasise 
the very beginning of the world or in the <Firmament> ‘void’ created between the 
waters (Gen 1:6), which God calls <Hemel> ‘sky’. It is precisely the act of naming 
that entails capitalisation. This can be impressively demonstrated by the Vorster- 
manbijbel, see the examples in (3)-(5): Here, nouns like dach ‘day’, nacht ‘night’, 
hemel ‘sky’, aerde ‘earth’, and zee ‘sea’, which are usually written with minuscules 
(see (3)-(5b)), are capitalised in the context of naming (see (3)-(5a)): 


(3) a. “En naemde dlicht/den Dach/ende die dusternissen/den Nacht/”
       ‘And he called the light day, and the darkness night’
    b. “ende dmeeste licht den daghe voorzijn soude/en dminste licht/
       dat=tet den nachte voorzijn soude”
       ‘and the greater light to govern the day, and the lesser light to govern
       the night’

(4) a. “En God namede dat firmament/den Hemel/”
       ‘And God called the void sky’
    b. “Die wateren die onder den hemel zijn”
       ‘The waters that are under the sky’

(5) a. “En God naemde die drooghe plaetse/Aerde/en die vergaderin=ge
       des waters/naemde hi die Zee/”
       ‘And God called the dry ground earth, and the gathered waters he
       called seas’
    b. “ende veruult dat water der zee, ende dat geuogelte vermenichfuldige
       hem opter aerden”
       ‘fill the water in the seas, and let the birds increase on the earth’


6 For the capitalisation practice in further German bible prints of the first half of the 16th century,
see Risse (1980). For the capitalisation practice in Luther’s handwritten letters see Moulin (1990).


Other examples of a pragmatically driven capitalisation usage supporting
hypothesis 3 are found in hemel ‘sky’, aerde ‘earth’, zee ‘sea’, walvisschen
‘great creatures of the sea’ (lit. ‘whales’), dagh ‘day’, nacht ‘night’, man ‘man’,
and wijf, resp. vrouwe, ‘woman’ throughout the first half of the 16th century,
all of which denote products of God’s creation. These eight nouns are also the
ones that are most frequently attested and most consistently highlighted with
capital letters throughout the first editions, as indicated by the “+” in Table 4
(original spelling retained).


Table 4: The most frequently capitalised nouns in first editions 


lemma 1528 1548 1558 1560 1562 1599 1637 1648 
hemel ‘sky, heaven’ + + + + + + + + 
aerde ‘earth’ + --- + + + + + + 
zee ‘sea’ + --- + --- + + + + 
walv. ‘whales’ --- + + + + --- + + 
dagh ‘day’ + --- + + + nn. + 
nacht ‘night’ + --- + + + a + 
man ‘man’ + + + + + 
wijf, vrouwe ‘woman’ + + + + + 
Interestingly, a pragmatic use of word-initial majuscules prevails until the first half
of the 17th century, as documented by the Lutherse Vertaling (1648). Here, only
concepts denoting humans (i.e. <Menschen> ‘mankind’, <Man> ‘man’, <Vrouwe>
‘woman’) are uppercased besides <Hemel, Aerde, Zee, Waluisschen>, thus reflecting
the Christian conception of humans as the climax of God’s creation.


3.3.2 Majuscules in concrete versus abstract nouns 


As already pointed out in the previous section, an increasing non-emphatic use
of word-initial majuscules in common nouns becomes tangible from the second
half of the 16th century onwards. However, as Figure 2 shows, sentence-internal
capital letters in Dutch bible prints not only spread at a very slow rate from the
middle of the 16th century onwards, but their use is also characterised by a clear
inconsistency across the first editions under consideration: in most cases, capital
letters in common nouns do not exceed the 45% mark attested for the
Emden Bijbel (1558) and the Biestkensbijbel (1560).


[Figure 2: bar chart; y-axis 0%-90%; bar labels as printed: 48%, 46%, 27%, 23.5%, 19%, 12%]

1558 Emden Bijbel | 1560 Biestkensb. | 1562 Deux-Aesb. | 1599 Moerentorfb. | 1637 Statenvertaling | 1648 Luth. Vert.

n =          159 159 162 162 162 161
majuscules = 40 52 38 19 44 30


Fig. 2: Capitalisation patterns for common nouns in Dutch bible editions 


7 For the translation of the Statenbijbel (1637) reissued in the Biblia Pentapla (1711), see section 3.4. 
A closer look at the new additions is particularly instructive for determining
further factors contributing to the capitalisation of common nouns. Most of the
newly uppercased noun types are concrete. Furthermore, most of them denote
animate entities (e.g. <Ghewormte, Vee, Boomen> ‘worms, livestock, trees’), thus
supporting the assumption that sentence-internal majuscules spread along the
animacy scale (hypothesis 4). Consistent with this argument is the observation
that word-initial capital letters are most consistently applied to nouns denoting
humans throughout these six editions, i.e. to man ‘man’, wijf/vrouwe ‘woman’, and
mensche ‘mankind’. The general tendency to capitalise concepts denoting
humans may also be influenced by pragmatic factors, reflecting a conception of
the world with an ontological creation hierarchy, in which “humans are seen as the
climax of creation content” (cf. Kennard 2013: 334).


[Figure 3: bar chart; legend: concrete (C), abstract (A)]


1558 1560 1562 1599 1637 1648 
Emden Bijbel | Biestkensb. | Deux-Aesb. | Moerentorfb. | Statenvertaling | Luth. Vert. 
n= 159 159 162 162 162 161 
C= 31/121 27/112 33/124 18/109 40/114 30/110 
A= 9/38 25/47 5/38 1/53 4/48 0/51 


Fig. 3: Capitalisation patterns for concrete versus abstract nouns in Dutch bible editions (in %) 


Figure 3, finally, displays the capitalisation practice in the respective six bible
editions for concrete nouns (grey columns) as opposed to abstract ones (black
columns). Whereas in the Emden Bijbel (1558) and the Biestkensbijbel (1560)
concrete and abstract nouns are capitalised almost to an equal extent (with
abstract nouns even slightly outweighing concrete ones), the overall distribution
of sentence-internal majuscules in the remaining four bibles clearly shows their
strong affinity for concrete nouns. The percentage figures even suggest that while
the use of uppercase letters in Dutch bibles has generally decreased since 1560, it
is increasingly restricted to concrete nouns at the same time: whereas the
Deux-Aesbijbel (1562) scores capitalisation rates twice as high for concrete as
for abstract nouns (27% vs. 13%), the levels of uppercased abstract nouns
have not exceeded 5% of the cases since 1599, see Figure 3.
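
The percentages discussed above can be recomputed from the counts printed under Figure 3. The following is a minimal sketch, assuming that each pair C = x/y and A = x/y gives capitalised nouns over all nouns of that class; the variable and function names are illustrative and not part of the study. Small deviations from the percentages quoted in the text may reflect rounding or extraction errors in the counts.

```python
# Counts printed under Figure 3: (capitalised, total) for each first edition.
EDITIONS = ["1558", "1560", "1562", "1599", "1637", "1648"]
TOTALS = [159, 159, 162, 162, 162, 161]
CONCRETE = [(31, 121), (27, 112), (33, 124), (18, 109), (40, 114), (30, 110)]
ABSTRACT = [(9, 38), (25, 47), (5, 38), (1, 53), (4, 48), (0, 51)]


def rate(capitalised, total):
    """Share of capitalised nouns in percent, rounded to one decimal place."""
    return round(100 * capitalised / total, 1)


def rates_by_edition():
    """Map each edition year to (concrete rate, abstract rate)."""
    rows = {}
    for year, n, (c_cap, c_all), (a_cap, a_all) in zip(
        EDITIONS, TOTALS, CONCRETE, ABSTRACT
    ):
        # Concrete and abstract nouns together must add up to the printed n.
        if c_all + a_all != n:
            raise ValueError(f"class totals do not add up to n for {year}")
        rows[year] = (rate(c_cap, c_all), rate(a_cap, a_all))
    return rows


if __name__ == "__main__":
    for year, (concrete, abstract) in rates_by_edition().items():
        print(f"{year}: concrete {concrete}%, abstract {abstract}%")
```

For the Deux-Aesbijbel (1562), the sketch yields the 27% vs. 13% contrast cited in the text (before rounding to whole percentages).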

The inconsistency in the capitalisation practice documented for the first
editions becomes even more pronounced if their subsequent editions are taken into
account. Three to four reissues were additionally consulted for the
Biestkensbijbel (1560, 1582, 1646, 1702, 1750), the Deux-Aesbijbel (1562, 1579, 1597, 1633), the
Statenbijbel (1637, 1670, 1708, 1747), and the Lutherse Vertaling (1648, 1671, 1701,
1748), see Figure 4:


[Figure 4: line chart; y-axis 0-100]

                              c. 1560 | c. 1580 | c. 1640 | c. 1670 | c. 1700 | c. 1750
Biestkensb. (n = 159)         33%     | 11%     | 74%     | ---     | 18%     | 64%
Deux-Aesb. (n = 162)          23.5%   | 50%     | 31.5%   | ---     | ---     | ---
Statenvertaling (n = 162)     ---     | ---     | 27%     | 4.5%    | 0%      | 0%
Luth. Vertaling (n = 161)     ---     | ---     | 19%     | 33.5%   | 28%     | 64%
trend (percentual average)    28%     | 30.5%   | 38%     | 19%     | 15%     | 43%


Fig. 4: Capitalisation patterns for common nouns in the editions of the Biestkensbijbel, the 
Deux-Aesbijbel, the Statenbijbel, and the Lutherse Vertaling (in %) 
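
The trend row of Figure 4 appears to be the unweighted mean of the percentages available for each period. The sketch below illustrates this under two assumptions that are not stated in the source: that missing editions are simply left out of the mean, and that the three Deux-Aesbijbel values belong to c. 1560, c. 1580 and c. 1640. All names are illustrative.

```python
# Percentages per bible version as printed in Figure 4.
# None marks a period for which no edition was consulted.
PERIODS = ["c. 1560", "c. 1580", "c. 1640", "c. 1670", "c. 1700", "c. 1750"]
RATES = {
    "Biestkensbijbel": [33, 11, 74, None, 18, 64],
    "Deux-Aesbijbel": [23.5, 50, 31.5, None, None, None],  # column placement assumed
    "Statenvertaling": [None, None, 27, 4.5, 0, 0],
    "Lutherse Vertaling": [None, None, 19, 33.5, 28, 64],
}
PRINTED_TREND = [28, 30.5, 38, 19, 15, 43]


def trend(rates):
    """Unweighted mean per period over the versions that have a value."""
    means = []
    for column in zip(*rates.values()):
        present = [value for value in column if value is not None]
        means.append(sum(present) / len(present))
    return means


if __name__ == "__main__":
    for period, mean, printed in zip(PERIODS, trend(RATES), PRINTED_TREND):
        print(f"{period}: computed {mean:.2f}%, printed {printed}%")
```

Under these assumptions, every computed mean lies within half a percentage point of the printed trend value, which suggests that the trend line is not weighted by the number of nouns per edition.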
The overall picture clearly shows tremendous differences in the use of word-initial
majuscules in common nouns between these four bible versions (e.g. the
Statenbijbel vs. the Lutherse Vertaling) but also great variation across the individual
editions of each bible version (e.g. the Biestkensbijbel). Whereas the
editions of the Biestkensbijbel and the Lutherse Vertaling reissued around 1750
are characterised by a remarkable increase in uppercased nouns (c. 64% each),
sentence-internal capitalisation is given up in the subsequent editions of the
Statenbijbel, see Figure 4. This is particularly striking when compared to the
developments in German and English sketched in section 2.1, which are characterised
by a steady increase in uppercased common nouns from the 16th (German) and
17th (English) centuries onwards (see also Figure 1). In contrast, Dutch is
characterised by an inconsistent zig-zag development with respect to word-initial
capitalisation, as illustrated by the trend line in Figure 4. It can thus be assumed that
the capitalisation of common nouns did not prevail in Dutch spelling for two
reasons. First, the development was in itself inconsistent. Second and more
importantly, the Statenbijbel, which had a great influence on the standardisation of
Dutch language and spelling (cf. van der Sijs 2004), banned uppercased common
nouns in its subsequent editions.


            c. 1560   | c. 1580   | c. 1600   | c. 1640   | c. 1670   | c. 1700   | c. 1750
concrete    38%       | 36%       | 44%       | 21%       | 21%       | 21%       | 58%
            (n = 348) | (n = 236) | (n = 460) | (n = 460) | (n = 224) | (n = 336) | (n = 336)
abstract    29.5%     | 16.5%     | 24%       | 22%       | 7%        | 2%        | 6%
            (n = 132) | (n = 85)  | (n = 91)  | (n = 184) | (n = 99)  | (n = 146) | (n = 146)


Fig. 5: Capitalisation patterns in concrete versus abstract nouns (in %) 
Nevertheless, it is still instructive to take a closer look at the temporarily
increasing capitalisation practice as attested in the subsequent editions of the
Deux-Aesbijbel, the Biestkensbijbel, and the Lutherse Vertaling. Figure 5 displays the
average capitalisation ratios for concrete and abstract nouns at seven points in time
between c. 1560 and c. 1750, calculated from the capitalisation ratios of these three bibles
(for the exact figures, see Fig. 6-8). The graphs show that while abstract
nouns were increasingly lowercased in the long run (with a temporary
capitalisation peak of c. 30% around 1560), concrete nouns were generally more prone to
capitalisation, reaching their highest levels by the middle of the 18th century. The
overall picture suggests that the spread of majuscules within abstract nouns has more
or less stagnated since the 17th century, while there is still a continuing extension
within concrete nouns (see also Fig. 6-8).


[Figure 6: line chart; y-axis 20-90]

            total       1562       1579       1597       1633
concrete    (n = 124)   27%        56%        60%        39%
                        (n = 33)   (n = 69)   (n = 74)   (n = 48)
abstract    (n = 38)    13%        32%        55%        8%
                        (n = 5)    (n = 12)   (n = 21)   (n = 3)
average     (n = 162)   23.5%      50%        59%        31.5%
                        (n = 38)   (n = 81)   (n = 95)   (n = 51)


Fig. 6: Capitalisation patterns for concrete versus abstract nouns in the editions of the 
Deux-Aesbijbel (in %) 


Similar pictures emerge for each of the three bible versions and their subsequent
editions. In the case of the Deux-Aesbijbel, the capitalisation rate rises roughly
two and a half times within only 35 years, from 23.5% in 1562 to 59% in 1597, see
Figure 6. Again, this massive increase in the usage of majuscules in concrete nouns,
mainly those denoting living beings, is predicted by the animacy principle (hypothesis 4).
Thus, nouns referring to humans are now always capitalised (<Menschen, Man,
Wijf>), followed by animate (e.g. <Dieren, Vogelen>) and inanimate concrete
entities (e.g. <Wateren, Zee>).

In the case of the Biestkensbijbel, the edition of 1646 averages 74% com- 
pared to the first edition from 1560 with 46%, whereas the editions from the 18th 
century tend to capitalise almost exclusively concrete nouns, see Figure 7. 
            total       1560       1582       1646        1702       1750
concrete    (n = 112)   24%        13%        75%         26%        87.5%
                        (n = 27)   (n = 15)   (n = 74)    (n = 29)   (n = 98)
abstract    (n = 47)    53%        4%         70%         0%         8.5%
                        (n = 25)   (n = 2)    (n = 33)    (n = 0)    (n = 4)
average     (n = 159)   33%        11%        74%         18%        64%
                        (n = 52)   (n = 17)   (n = 117)   (n = 29)   (n = 102)


Fig. 7: Capitalisation patterns for concrete versus abstract nouns in the editions of the 
Biestkensbijbel (in %) 


The general increase in capitalisation can on the one hand be attributed to a more
consistent use of majuscules within nouns that had already been uppercased in
1560, mainly concrete ones such as <Menschen> ‘humankind’, <Boomen> ‘trees’,
<Aerde> ‘earth’, and <Licht> ‘light’ as well as <Beeld> ‘image’, <Dach> ‘day’ and
<Nacht> ‘night’ in the case of abstract nouns. On the other hand, uppercase
letters are newly introduced in fifteen different types, among which there are almost
exclusively concrete nouns: seven designating animate entities (<Dieren,
Ghedierte> ‘animals’, <Voghelen, Ghevogelte> ‘birds’, <Visschen> ‘fishes’, <Gras>
‘grass’, <Kruyt> ‘herb’), and five referring to non-animate objects (<Water> ‘water’,
<Duysternis> ‘darkness’, <Firmament> ‘void’, <Zaet> ‘seed’, <Zee> ‘sea’). In the
case of the two abstract nouns tekenen ‘signs’ and tijden ‘seasons’, the
introduction of capital letters may be attributed to the context in which they appear:
together with dagen ‘days’ and jaren ‘years’ they are part of a noun series that
emphasises the function of the lights, i.e. the heavenly bodies (e.g. the Sun, Moon
etc.) created by God to separate day from night, and also to be signs for seasons,
days, and years (cf. Genesis 1:14):


1560: Daer worden lichten aen dat firmament des Hemels/ende 
scheyden dach ende nacht/ende zijn 
tot teeckenen/tijden/daghen ende Jaren. 


1646: Daer worden Lichten aen dat Firmament des Hemels/en= 
de scheyden Dach ende Nacht/ende zijn tot 
Teeckenen/Tijden/Daghen ende Jaren. 
            total       1648       1671       1701       1748
concrete    (n = 110)   27%        43%        38%        89%
                        (n = 30)   (n = 47)   (n = 42)   (n = 98)
abstract    (n = 51)    0%         14%        6%         10%
                        (n = 0)    (n = 7)    (n = 3)    (n = 5)
average     (n = 161)   19%        33.5%      28%        64%
                        (n = 30)   (n = 54)   (n = 45)   (n = 103)


Fig. 8: Capitalisation patterns for concrete versus abstract nouns in the editions of the Lutherse
Bijbel (in %)
The tendency to capitalise concrete nouns almost exclusively is also attested in the
editions of the Lutherse Vertaling, see Figure 8. Whereas the use of majuscules is
still pragmatically driven in the first edition from 1648 (see section 3.3.1),
word-initial capital letters are more frequently used in the subsequent editions. Within
concrete nouns, majuscules are most consistently applied in words denoting
living beings as opposed to inanimate ones, once again confirming the animacy
principle of hypothesis 4 (e.g. <Menschen> ‘mankind’, <Man> ‘man’, <Vrouw>
‘woman’, <Vee> ‘livestock’, <Dieren> ‘animals’, <Visschen> ‘fishes’ etc. vs.
<plaetsen> ‘places’, <drooge> ‘the dry’).


3.4 The influence of the source-text 


The remaining question is whether the sentence-internal capitalisation practice
documented for Dutch bibles can also be attributed to the influence of German
printers introducing their capitalisation conventions into Dutch, as assumed by
Maas (1995, 2007). This influence is, of course, difficult to operationalise. Thus,
distinguishing Dutch bibles with a German textual basis from those with a
non-German source text in order to test Maas’ assumption can only be a first
tentative approximation. It rests on the hypothesis that higher capitalisation
levels are to be expected in Dutch editions with a German textual basis than in
those which relied upon a non-German source text, be it Jerome’s Latin Vulgate
or the Hebrew version of the Old Testament. The rationale behind this assumption
is the following: it seems reasonable that Dutch printers may have adopted the
capitalisation usage found in the German source texts in order to provide an
accurate reproduction of the original.

Figure 9 arranges the bible editions according to their source text with bible
translations based on a non-German source on the left (e.g. the Delftse Bijbel 1477;
the Statenbijbel 1637) and those with a German one on the right (e.g. the Lutherse
Vertaling 1648). Dutch translations that partially relied on a German translation
of Luther’s Bible were also assigned to the second group (e.g. the Emden Bijbel
1558). The Biblia Pentapla (1711) represents a special case in so far as it is a polyglot
edition in the widest sense, comprising the Dutch official version by the
Staten-Generaal 1637, i.e. the translation of the Statenbijbel, plus four versions of the text
in German: a Roman Catholic one translated by Caspar Ulenberg, Martin Luther’s
translation, Johann Piscator’s Reformed bible version, and a Hebrew-German (i.e.
Yiddish) version. It is therefore excluded from Figure 9.

As shown in sections 3.2 and 3.3, sentence-internal majuscules are missing
completely in the earliest printed Dutch bible from the late 15th century, and even until
the middle of the 16th century, they are sparingly used except in pragmatically
driven but still isolated cases, irrespective of their source text. On the one hand, this
may be attributed to the fact that most of these early vernacular bible prints were
based on the Latin Vulgate, which lacked word-initial majuscules as well (compare,
for instance, the edition of the Latin Vulgate in the Gutenberg Bible from the middle
of the 15th century). On the other hand, we must keep in mind that the earliest
Luther translation of the Old Testament issued in 1523, which served as textual basis
for the Liesveltbijbel (1526), sparingly uses word-initial majuscules with the
exception of ‘God’ (see section 3.3). It thus seems plausible to assume that Van Liesvelt
consistently adopted sentence-internal capitals only in the case of God(t).


[Figure 9: diagram linking source texts to the Dutch editions based on them; the connecting lines are not recoverable]

NON-GERMAN SOURCES: Latin Vulgate; Hebrew text of the Old Testament
GERMAN SOURCES: Luther’s translation (High German 1523, see footnote 4; Low German; High German)

Editions shown: Delftse Bijbel 1477; Den Bibel int Corte 1513; Liesve [label truncated];
Blanckartbijbel 1548; Leuvense Bijbel 1548; Vorstermanbijbel 1528; Moerentorfbijbel 1599;
Emden Bijbel 1558; Biestkensbijbel 1560; Statenbijbel 1637; Lutherse Vertaling 1648


Fig. 9: The textual bases of the Dutch bible translations


An increasing use of uppercased nouns indeed appears for the first time in a bible
edition that partially relied upon a German source: the Emden Bijbel 1558, which
revised the text of the Liesveltbijbel (1526) by adapting the Low German edition of
Luther’s Bible printed in Magdeburg in 1554. Similar capitalisation rates are also
attested in the Biestkensbijbel (1560), which, in turn, reflects a revised version
of the Emden Bijbel (1558) (see also Table 4). Interestingly, a synoptic comparison


8 A digital copy of Genesis 1 is available online:
https://upload.wikimedia.org/wikipedia/commons/5/56/Gutenberg_Bible_B42_Genesis.JPG (last accessed: 28-2-2018).
of Genesis 1 reveals that neither the Emden Bijbel nor the Biestkensbijbel adopted
the capitalisation practice found in Luther’s Low German edition from 1554 on a
one-to-one basis, suggesting that the Dutch printers themselves must be regarded
as an intervening variable in the overall process. This is illustrated by the
excerpt from Genesis 1 under (4) [boldface by J.N.]: despite the parallel
use of majuscules and minuscules as in <Lycht> – <Licht>, <Auende> – <Auont>,
<duesternisse> – <duysternisse> (underlined in (4)), there are also clear
differences, as shown by the boldfaced words:


(4) Luther-Bible 1554 (Low German)           Emden Bijbel 1558
    vnd ydt was duester up der Duepe/Vnd     en=de het was duyster op der diepte/ende
    de geist Gades sweuede up dem water.     de Gheest Gods sweef=de op den watere.
    [...]                                    [...]
    Vnd ydt wart lycht/                      En[de] het wert Licht.
    [...]                                    [...]
    Do scheidede Godt dat Lycht              Doe[n] schiet Godt dat Licht
    van der Dussternisse/vnde noemede dat    va[n]der duysternisse/Ende noem=de dat
    Lycht dach/vnd de duesternisse/nacht.    Licht/Dach/en[de] de duysternisse/Nacht.
    Do wart vth Auende vnde morgen/de        Doen wert wt Auont en[de] Morgen/den
    Erste dach.                              eer=sten dach.


On the one hand, in 12% of the cases capital letters are used in the Emden Bijbel
(1558) for nouns which are not capitalised in the Low German version (e.g. ‘spirit’,
‘light’, ‘day’, ‘night’, ‘morning’). On the other hand, nouns that feature majuscules
in the German text are lowercased in the Dutch print in 34% of the cases (e.g.
‘depth’, ‘darkness’). Table 5 summarises the total distribution of
matches and mismatches in the use of sentence-internal majuscules in common
nouns between Luther’s Low German edition from 1554 and the Emden Bijbel
(1558) and the Biestkensbijbel (1560), respectively.


Table 5: Convergences and divergences in the capitalisation practice between Luther’s Bible 
(1554) and its Dutch counterparts 


Luther 1554   | Emden Bijbel 1558             | Biestkensbijbel 1560
              | majuscules    | minuscules    | majuscules    | minuscules
majuscules    | 34% (n = 56)  | 34% (n = 56)  | 33% (n = 53)  | 36% (n = 59)
minuscules    | 12% (n = 19)  | 20% (n = 32)  | 12% (n = 20)  | 19% (n = 31)
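
The cell percentages in Table 5 can be recomputed from the printed counts, and the same counts give the overall share of nouns on which source and translation agree. The following is a minimal sketch; the dictionary layout and the function name `summarise` are illustrative assumptions, and the agreement rate is derived here rather than reported in the study.

```python
# Cell counts from Table 5. Keys are (Luther 1554, Dutch edition),
# where "maj" = majuscule and "min" = minuscule.
TABLE_5 = {
    "Emden Bijbel 1558": {
        ("maj", "maj"): 56, ("maj", "min"): 56,
        ("min", "maj"): 19, ("min", "min"): 32,
    },
    "Biestkensbijbel 1560": {
        ("maj", "maj"): 53, ("maj", "min"): 59,
        ("min", "maj"): 20, ("min", "min"): 31,
    },
}


def summarise(cells):
    """Return the total and the rounded percentages of matches and mismatches."""
    total = sum(cells.values())

    def pct(count):
        return round(100 * count / total)

    return {
        "total": total,
        # Both texts agree: both capitalise or both lowercase the noun.
        "match_pct": pct(cells[("maj", "maj")] + cells[("min", "min")]),
        # Capitalised only in the Dutch print.
        "dutch_only_pct": pct(cells[("min", "maj")]),
        # Capitalised only in Luther's Low German edition.
        "luther_only_pct": pct(cells[("maj", "min")]),
    }


if __name__ == "__main__":
    for edition, cells in TABLE_5.items():
        print(edition, summarise(cells))
```

On these counts, the two Dutch editions agree with their Low German source on only slightly more than half of the nouns, which is in line with the argument that the printers did not copy the source one-to-one.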
Whereas both Dutch translations exhibit significantly (Fisher’s Exact Test:
p < .001) lower capitalisation ratios than their Low German source text from 1554
(46% and 45%, respectively, vs. 84%), they score higher capitalisation levels in the
case of abstract nouns (51% and 53%, respectively, vs. 35%), see Figure 10:


[Figure 10: bar chart; y-axis 0-100]

            1554             1558            1560
            Luther’s Bible   Emden Bijbel    Biestkensbijbel
concrete    84%              46%             42.5%
            (n = 117)        (n = 112)       (n = 112)
abstract    35%              51%             53%
            (n = 48)         (n = 47)        (n = 47)


Fig. 10: Capitalisation patterns in concrete versus abstract nouns in Luther’s Bible (1554) 
compared to the Emden Bijbel (1558) and the Biestkensbijbel (1560) (in %) 
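
The kind of significance test mentioned above can be illustrated with a small, dependency-free implementation of the two-sided Fisher’s exact test. This is a sketch only: the source does not state which counts entered the test, so the 2x2 table below is reconstructed from the concrete-noun percentages in Figure 10 (84% of 117 is taken as 98 capitalised nouns, 46% of 112 as 52), and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are at most as probable as the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x items in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Rows: Luther 1554, Emden Bijbel 1558 (concrete nouns only).
    # Columns: capitalised, not capitalised. Counts are reconstructed.
    p_value = fisher_exact_two_sided(98, 117 - 98, 52, 112 - 52)
    print(f"p = {p_value:.2e}")
```

With these reconstructed counts the p-value falls well below .001, which is consistent with the significance level reported in the text, although the exact figure depends on the counts actually used.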


Difficult to assess in this context are the results of the Deux-Aesbijbel from 1562
and its revised versions from 1579, 1597, and 1633, which score capitalisation levels
between 23.5% and 59% (see also Figure 6). Some researchers classify its Old
Testament translation from 1562 as dependent on Liesvelt’s edition (cf. Brandt 1952: 12,
van der Sijs/Beelen 2009), whereas others claim that Luther’s Low German version
from 1554 (Magdeburg) was additionally consulted by the translator Godfried van
Winghe (cf. Vogel 1958: 133; den Besten 2016: 113). In the first case, the increased
use of majuscules in common nouns would not be attributable to the direct source
(the Liesveltbijbel lacks them completely) but to the translator and/or the
printer. If, however, Luther’s Low German version served as textual basis as well,
then the introduction of sentence-internal majuscules may indeed have been
influenced, at least partially, by the capitalisation practice in the Low German source.

Against this background, the low capitalisation levels in common nouns
scored by the first edition of the Lutherse Vertaling in 1648 are rather surprising,
since its High German basis⁹ exhibits significantly higher capitalisation ratios:
19% vs. 82% (see also section 3.3). Even the Statenbijbel (1637), printed at
approximately the same time, features higher capitalisation levels (c. 27%), although it
clearly reflects a translation based on the original Hebrew source text lacking any
uppercased common nouns. This suggests that Dutch printers did not blindly
follow the German capitalisation conventions.

In this context, it is particularly interesting to take a closer look at the Dutch
text printed together with four German versions of the bible in the Biblia Pentapla
(1711). Although the Dutch translation is stated to represent the translation of the
Statenvertaling (1637), the capitalisation practice fully corresponds to the
conventions found for German translations, with each head of a noun phrase being
capitalised. This is the only Dutch bible translation with a 100% capitalisation ratio. In
this case, the introduction of word-initial majuscules may well be attributed to the
influence of German printers, since the Biblia Pentapla was issued in Germany
(Wandsbek, near Hamburg) by a German printer (Hermann Heinrich Holle).

In sum, German printers’ influence on the capitalisation practice in Dutch
bible prints cannot be dismissed, but it should be kept in mind that in most
cases, Dutch printers did not follow the German original text and its
capitalisation conventions on a one-to-one basis. Further studies are needed to find out
whether and to what extent the use of word-initial majuscules, once initiated by
German printing habits, took on a dynamic of its own among Dutch printers,
who, in turn, may have developed capitalisation conventions of their own. In
the case of Dutch bible prints, such a scenario becomes tangible in rudimentary form.
Yet, the mere existence of a capitalisation practice in bibles does not mean that
it was widely used in other printed texts. Again, further studies will be needed
to provide a comprehensive picture of the capitalisation practice in Dutch.


9 It is not clear from the title page and the foreword which High German version of Luther’s
Bible was used, compare the subheading on the title page: “Van nieuws uijt D. M. Luthers
Hoogh-Duijtsche Bibel in onse Neder-landsche tale getrouwelijck over-geset” ‘Newly [and]
faithfully translated from D. M. Luther’s High German Bible into our Dutch language’. Given that
Luther’s High German editions of the bible have exhibited capitalisation scores higher than 80%
since the print from 1545, we can assume that the underlying version for the Dutch translation
exhibited at least similar or even higher capitalisation levels.
4 Conclusion 


The analysis of Dutch bibles printed between 1450 and 1750 has clearly shown that
Dutch once exhibited a tendency towards sentence-internal capitalisation, which,
however, varied tremendously throughout the whole period depending on external
factors such as the year of publication and the textual basis of the translation.
Thus, the earliest bible prints from the late 15th century and the first quarter of
the 16th century lack word-initial majuscules (almost) completely, whereas the
editions from the second quarter of the 16th century onwards slowly introduce
uppercase letters for nouns, mainly as reverence markers in the case of God and
other religious concepts or as emphatic signals for single words of special
significance to the context. Hence, Dutch capitalisation practice is initially pragmatically
driven as assumed under hypothesis 3, thus paralleling the early developments of
German and English capitalisation practice (see sections 2.1 and 3.3.1). Moreover,
the use of word-initial majuscules in Dutch bibles proves to be as sensitive
to the noun category as in German and English (thus confirming hypothesis 1; see
sections 2.1 and 3.1).

Within the noun category, capital letters are first conventionalised, as expected
under hypothesis 2, in proper nouns around the first quarter of the 16th century
and in nomina sacra circa fifty years later (see section 3.2), while a semantically
driven capitalisation practice in common nouns, with majuscules spreading
from concrete to abstract nouns, does not become tangible earlier than the second
half of the 16th century (and in some instances, the influence of pragmatic factors
cannot be entirely excluded, see section 3.3). Though the spread of word-initial
majuscules in common nouns partially conforms to the animacy scale as attested
for German and for English, lexemes designating non-human animate entities do
not always show a significantly higher affinity for majuscules than their inanimate
concrete counterparts (see section 3.3.2). Hence, hypothesis 4 is confirmed to a
great extent.

As has been shown in section 3.4, German capitalisation practice probably
exerted some influence on the use of word-initial majuscules in Dutch bibles, as
assumed by Maas (1995, 2007). However, it must be pointed out that Dutch printers
did not adopt the capitalisation conventions of the German source text on a
one-to-one basis. Rather, Dutch printers appear to have temporarily established a
capitalisation practice of their own with a clear preference for uppercasing concrete
nouns during the 17th and 18th centuries, a tendency that strikingly differs from
the development attested in German (and English).

Finally, the abandonment of sentence-internal capitalisation in Dutch
spelling was attributed, on the one hand, to the tremendous inconsistency of the overall
development in Dutch capitalisation practice, which was never characterised
by a steady increase in majuscules. On the other hand, the Statenbijbel,
which had a great impact on the standardisation of Dutch language and spelling,
was the first bible translation to consistently ban uppercased common nouns
from its printed editions, from the second half of the 17th century onwards.

Since this study was restricted to the analysis of bibles, the results cannot be
easily generalised to other text types. Further research on this phenomenon is
needed, which uses the present findings as a starting point but widens the
focus of the analysis to include, inter alia, the influence of syntactic factors such as NP
complexity.
Joachim Kokkelmans (Verona) 
Middle High German and modern Flemish 
s-retraction in /rs/-clusters 


Abstract: Late medieval High German experienced a systematic sound shift,
although not consistently reflected in modern standard German, which
transformed /rs/ into /rʃ/. It is argued that this sound shift must be analysed
phonetically as a perseverative tongue shape and place of articulation assimilation from
[rs] to an apical retracted alveolar [rs̠̺], followed by the phonological reanalysis
of a perceptually ambiguous [rs̠̺] as /rʃ/. This analysis implies the hypothesis that
only alveolar rhotics can trigger s-retraction in /rs/, not uvular rhotics.
Approximately seven hundred years after the MHG s-retraction, a similar sound shift
occurs in Flemish varieties of standard Dutch. A pilot study with Flemish speakers
confirms that the motivation for s-retraction in /rs/ must be attributed to an
alveolar /r/. Further typological inquiries into the historical phoneme inventories of
English, Dutch and German varieties, reinforced by the comparison with other
(non-)Germanic languages, also confirm this hypothesis.


Zusammenfassung: Im Mittelhochdeutschen trat eine Lautverschiebung auf, bei
der /rs/ zu /rʃ/ wurde, obwohl sie nicht konsequent im modernen Hochdeutschen
widerspiegelt ist. Es wird postuliert, dass diese Lautverschiebung phonetisch als
eine progressive Zungenform- und Artikulationsortassimilation von [rs] zu einem
apikalen retrahierten alveolaren [rs̠̺] zu verstehen ist, wonach ein perzeptuell
zweideutiges [rs̠̺] als /rʃ/ reanalysiert wurde. Diese Analyse bringt die Hypothese
mit sich, dass nur ein alveolares /r/ s-Retraktion in /rs/ verursachen kann, nicht 
aber ein uvulares /R/. Etwa siebenhundert Jahre nach der mhd. s-Retraktion tritt 
offensichtlich die gleiche Lautverschiebung in flämischen Umgangssprachen auf. 
Eine Pilotstudie mit Flämischsprachigen bestätigt, dass die Motivierung für 
s-Retraktion in /rs/ einem alveolarem /r/ zuzuschreiben ist. Weitere typologische 
Beobachtungen über die historischen Lautinventare des Englischen, Nieder- 
ländischen und Deutschen, vom Vergleich mit anderen (nicht-)germanischen 
Sprachen ergänzt, bekräftigen diese Hypothese. 


1 Introduction 


The study of sound shift, to be most accurate and reliable, should be equally based
upon three pillars: phonological theory, phonetic observations and typological
evidence. The risk of doing phonetics without phonology is that of missing the
system-related reasons behind a specific observation, whilst the risk of doing
phonology without phonetics is that of losing sight of the practical
articulatory reasons which motivate and condition a certain pattern. Finally,
typology makes it possible to verify phonetic observations and phonological generalisations
inside the larger frame of natural languages.¹


Open Access. © 2020 Kokkelmans, published by De Gruyter. This work is licensed under
the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110668476-008

Furthermore, a well-balanced study needs not only to rely on diachronic, but 
also on synchronic comparison, since “no linguistic state of affairs can have been 
the case only in the past” (Lass 1997: 28). Synchronic evidence (in this case, from 
regional Flemish standard Dutch) sheds light on similar diachronic develop- 
ments which are too old to have been studied in real time (in this case, Middle 
High German s-retraction in /rs/). 

This paper will thus not only analyse the sound shift from /rs/ to /rʃ/ in West
Germanic languages from a synchronic and diachronic perspective, but also
illustrate more generally how phonology, phonetics and typology can (and
should) work together to understand a specific development in natural
languages. I first show that s-retraction in /rs/-clusters, at a phonetic articulatory
level, is only triggered by alveolar rhotics, due to the shape and placement of the
tongue; then, that at a phonological perceptual level, it consists of the reassignment
of the resulting phonetic output [s̠]² to a more posterior phoneme category,
due to acoustic closeness to that category; and I conclude with typological data
which support this.

In the next section (2.1), I describe two similar and independent processes of 
s-retraction in the consonant cluster /rs/: one which spread through late medie- 
val Germany, and one which is active in modern Flemish standard Dutch. I then 
present in 2.2 the existing literature dealing with this sound shift and formulate 
in section 2.3 a hypothesis which accounts better for the empirical observations 
made in 2.1. 

Section 3 contains a theoretical phonetic analysis of this sound shift as a 
perseverative tongue shape and place of articulation assimilation. On the phono- 
logical side, I show how the phonetic output is reattributed to a more posterior 
sibilant phoneme category. 


1 I wish to express my deep gratitude to Anne Breitbarth for her supervision, advice and support,
and to Torsten Leuschner for his review and encouragement to publish this paper. Thanks also to
two reviewers for their valuable comments and to Heleen Van Mol for telling me literally that
“certain pershons in Flandersh do it, but I don’t”.

2 The IPA notation [s̠] (with a minus sign indicating retractedness) stands for a voiceless retracted
alveolar sibilant. The additional subscript [◌̺] indicates apicality.

In section 4, I focus on the diachronic and synchronic rhotic and sibilant
inventories of English, German, Frisian and Dutch to demonstrate the hypothesis
of this paper by means of cross-linguistic comparison, supported by an acoustic
study of Flemish Dutch.

In section 5, I present additional typological data from other languages which
have or have had a similar sound shift, including e.g. Afrikaans and Basque. I then 
conclude with some remarks on the conservative sibilant inventories of Dutch, 
Frisian and Low German, compared to English and High German innovations. 


2 Problem description and hypothesis 


2.1 Two West Germanic processes of s-retraction in /rs/ 


Starting around the fourteenth century (Hall 2008: 231f.), High German dialects
experienced a systematic sound shift from /rs/ to /rʃ/, which reached approximately
the border with Low German dialects without spreading further northwards.
This change is reflected, although not systematically, in modern standard German.
Examples of this are (adapted from Hall 2008: 231, quoting Russ 1982: 77, 1978: 81):


(1) a. MHG ki[rs]e > NHG Ki[rʃ]e Kirsche “cherry”
    b. MHG hi[rs] > NHG Hi[rʃ] Hirsch “stag”
    c. MHG ve[rs] > NHG Ve[rs] Vers “verse”


As a) and b) illustrate, this sound shift affected /s/ both from an original Old High
German */s/ (compare Dutch kers) and from the deaffrication of the result of the
OHG consonant shift */t/ > /(t)s/ (compare Dutch hert). Example c) shows that not
all standard German words were affected, which is explained by the
historical process of heterogeneous variant selection in written standard German.³

Contrary to most modern German varieties, some German dialects still had 
s-retraction in /rs/ as a productive rule in recent times, for example the Swiss 
dialect from Jaun (adapted from Hall 2008: 232, quoting Stucki 1917): 


3 Standard German, originally only a written language, is predominantly based on High German
but also has Low German loanwords, compare e.g. the doublets Wappen “coat of arms” and Waffen
“weapons”. Vers in 1c) may thus have been borrowed from or influenced by varieties without
s-retraction (compare French vers, Italian verso).

(2) a. Ferse [fœ:rʃana] “heel”
    b. ein sauber-es [əssu:fərʃ] “a clean (one)”
    c. gib mir (e)s [gimərʃ] “give it to me”


As b) and c) show, the rule is not only productive inside single morphemes, but
also across morpheme and word boundaries respectively. When functioning
productively, s-retraction in /rs/ seems to operate independently of morpheme
and word boundaries, as long as /r/ is immediately followed by /s/. This is not
the case in modern standard German, where it does not apply across morpheme or
word boundaries (compare e.g. warst [va:(z)st] “you were”, not [va:(s)ʃt]; wieder so
[vi.də(g).zo] “again so”, not [vi.də(s).ʒo] in the IDS database of spoken German,
DGD: FOLK_E_00288). It can be concluded that the standard German words with
/rʃ/ from /rs/ in fact contain “fossilised” traces of a once-active rule, something
which is confirmed by its absence in newer loanwords (e.g. Star Wars [sta:5.w3(5) 
s], DGD: FOLK_E_00288). 
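
The difference between a productive and a fossilised rule can be illustrated with a small sketch. It is not part of the original analysis: the transcription convention (`+` for a morpheme boundary, `#` for a word boundary, one character per segment), the function name `retract_s_after_r` and the simplified input forms are assumptions made purely for illustration. In the productive setting the rule rewrites /s/ as [ʃ] whenever /r/ immediately precedes it, whether or not a boundary symbol intervenes.

```python
import re

# Toy notation (an assumption for illustration): "+" = morpheme boundary,
# "#" = word boundary, one character per segment.
RS_CLUSTER = re.compile(r"r([+#]?)s")


def retract_s_after_r(form: str, productive: bool = True) -> str:
    """Apply /rs/ -> [rʃ] to a transcribed form.

    productive=True  models a variety in which the rule is active and
                     ignores morpheme and word boundaries.
    productive=False models a variety in which the rule no longer applies;
                     lexicalised forms must then already contain "ʃ".
    """
    if not productive:
        return form
    # Keep the optional boundary symbol, replace only the sibilant.
    return RS_CLUSTER.sub(lambda m: "r" + m.group(1) + "ʃ", form)


if __name__ == "__main__":
    for form in ["kirse", "sauber+s", "mər#s", "kase"]:
        print(form, "->", retract_s_after_r(form))
```

With `productive=False`, a new loanword containing /rs/ passes through unchanged, which mirrors the absence of retraction in newer loanwords in standard German.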

This raises the following question: why did the MHG s-retraction in /rs/
continue to work in certain dialects, but stop functioning in others and in
modern standard German? Another question, raised by Hall (2008: 214) and
more generally known as the actuation problem (Weinreich/Labov/Herzog 1968: 
102), is: Why did only Middle High German experience this change, and not Mid- 
dle Low German or Old High German? In this paper, I give answers to both ques- 
tions, showing 1) that the phonetic properties of both /r/ and /s/ allow or impede 
s-retraction in /rs/, something which explains why it was maintained as a pro- 
ductive rule only in certain German dialects, and 2) that this sound shift was hin- 
dered in Low German, Frisian and Dutch because of their conservative sibilant 
inventories as compared to High German and English. 

Seven hundred years after the hypothesised start of MHG s-retraction in /rs/,
evidence from my own study on Flemish Dutch (Kokkelmans 2017) shows that the
same rule is active in certain Flemish varieties, in which /s/ after /r/ has significantly
different phonetic characteristics from [s], being much more [ʃ]-like. More
precisely, phonetic measurements yield acoustic values halfway between those of
[s] and [ʃ] (Kokkelmans 2017: 35f.), closest to those of a retracted alveolar [s̠].

In the regional Flemish standard Dutch varieties which have s-retraction in 
/rs/, it is observed to occur inside words as well as across morpheme or word 
boundaries, as demonstrated by the informant named Alveolar4 in the study 
(Kokkelmans 2017): 


(3) a. Uw pe[rs̠]oonlijke mening
       “your personal opinion”

    b. Ande[rz̠]ijds (= ander + zijd- + -s)
       “on the other hand”

    c. Dat er daa[r] [z̠]ogezegd
       “(...) that there, so to say (...)”


Examples b) and c) show that in voiced contexts (with underlying /z/), the result
of the sound shift is [z̠]. In these Flemish varieties, the /r/ is still phonetically
realised, meaning that [s̠] and [z̠] are contextual allophones of /s/ and /z/ after
/r/, but in other varieties (e.g. East Limburgish, see 3.2), the /r/ has been vocalised
or dropped. Despite r-loss, listeners can often reconstruct the pattern as /r/ + /s, z/
thanks to grammatical and lexical paradigms (e.g. Noors [no:s̠] “Norwegian” vs.
Noorwegen [no:r.we.xə] “Norway”).

The fact that the informant uses alveolar [r] as the phonetic realisation of /r/ 
is of crucial importance for this study, since I claim that only an alveolar /r/ allows 
s-retraction in /rs/ to start working as a productive rule. It has, however, been
claimed in the literature that the motivations of this sound shift are independent
of the phonetic properties of /r/ and /s/, something which shall be refuted in
this paper.


2.2 Existing literature about s-retraction in /rs/ 


Hall (2008: 215) describes the MHG shift from /rs/ to /rʃ/ as “a sound change that
has perplexed generations of Germanicists”. According to him, “there has been 
a tradition in the literature of either simply describing [s-retraction in /rs/, JK] 
without explaining [it] or of proposing a superficial explanation that does not 
stand up under closer scrutiny”. He rightly recognises the lack of a comprehensive 
study to understand the motivations of this sound shift, since most mentions of 
s-retraction in /rs/ in various languages (e.g. Eliasson 2000; Ewald 2015; Pedersen 
1895; Schmitt 2015; Torp 2001) are exclusively descriptive and do not formulate 
any hypothesis to explain it. Furthermore, I found no mention of the Flemish
s-retraction in /rs/ in the literature, although the fact that it is happening today
makes it especially relevant for understanding this sound shift.

To explain why this change happened in MHG and not in OHG or Low German,
Hall (2008) proposes a phonological analysis based on the distinctiveness of the
feature [high] (which distinguishes /ʃ/ from /s/) as the key to understanding what
he considers a height dissimilation. I will argue instead for a place and tongue
shape assimilation from [(dento-)alveolar] and [laminal/apical] (features of [s])
to [retracted alveolar] and [apical] (features of [r/ɾ]⁴ and [s̠]). This analysis as an
assimilation of the alveolar place of articulation implies the hypothesis that
s-retraction in /rs/ only arises in languages in which /r/ is realised as an alveolar
/r/. This contradicts Hall’s (2008: 233) statement that the exact (place of) articulation
of these sounds does not provide the explanation for the sound shift, and
his assumption that it can also be triggered by a uvular /R/:


One might hypothesize that the precise articulation of /r/ provides a clue, but I contend that
the surface phonetic facts are not important for an understanding of the [rs] > [rʃ] change.
Consider the fact that there are dialects like the one spoken in Jaun in which the change 
takes place after an apical /r/, but that there are also dialects in which the change occurs 
after a uvular /R/ [...] It needs to be stressed that it is the phonology of /r/ and /s/ [...] and not 
the phonetics of these segments which explains rs-Dissimilation. Fine-grained surface pho- 
netic facts are not important for the analysis I propose [...]. 


Against the view that s-retraction in /rs/ is not motivated by phonetic properties, 
I will argue and show that “it is mainly the concrete, phonetic properties of 
speech sounds that trigger or allow changes to take place in the sound system, 
and determine their subsequent development” (Chen/Wang 1975: 278). Taking 
phonetics, phonology and typology into account, this paper proposes another 
explanation for s-retraction in /rs/, and by pointing at the similarities between 
MHG and contemporary Flemish, it will account better for its motivations and its 
(non-)occurrence in Germanic languages. 

A similar sound shift, described amongst others by Stausland Johnsen (2012:
509ff.), is the Scandinavian retroflex assimilation which transforms [s] into a
postalveolar retroflex [ʂ] after an alveolar [r] or a retroflex postalveolar flap [ɽ]. He
demonstrates that there has been an intermediate step where an apical alveolar [r]
transformed a laminal alveolar [s] into an apical alveolar [s̺], which corroborates
the hypothesis of a tongue shape assimilation, although he does not make any
distinction in place of articulation between [alveolar] and [retracted alveolar]. Retroflex
assimilation can likewise be understood as triggered by a place of articulation
and tongue shape assimilation, but it differs from “non-retroflex”
s-retraction in /rs/ in that it applies to all alveolar consonants, including
/t, d, n, l/, to yield /ʈ, ɖ, ɳ, ɭ/.

4 I consider the prototypical place of articulation of [r/ɾ] in languages with only one alveolar
rhotic phoneme to be [retracted alveolar] for reasons detailed in section 3.1.

Stausland Johnsen (2012: 513-519) shows that although the vast majority of
Norwegian and Swedish dialects either have alveolar /r/ and retroflexion or uvular
/R/ and no retroflexion (i.e., retroflex assimilation and uvular /R/ seem to be
mutually exclusive), two exceptions exist at the border between /r/ and /R/: the
Frogner dialect had retroflexion and later innovated by replacing /r/ with /R/,
whilst the Arendal dialect had /R/ and no retroflexion, and “adopted the retroflexion
process from neighbouring dialects” (Stausland Johnsen 2012: 508). In
both dialects, the sound shift is productive, also across morpheme and word 
boundaries. Stausland Johnsen (ibid.: 506) considers retroflexion in these varie- 
ties with /R/ to be an ‘unnatural’ process which “lacks any synchronic phonetic 
motivation”. He shows that although these varieties seem to speak against the 
hypothesis of this paper, they are the historical consequence of intense language 
contact between varieties with /R/ and varieties which still today have productive 
retroflexion. That s-retraction in /rs/ can be learned as a productive rule even when 
its phonetic motivation is absent does not contradict the hypothesis that the pho- 
netic motivation for s-retraction lies in the assimilatory effect of an alveolar /r/ on 
a following /s/; it rather shows that language contact and historical developments 
can make s-retraction in /rs/ synchronically unmotivated, yet still learnable. 


2.3 Hypothesis 


From a theoretical perspective, s-retraction in /rs/ is posited to be a phonetic
assimilation of place of articulation and tongue position which generates the
output [rs̠], and a subsequent phonological reanalysis of this phonetic output as
a more posterior phoneme category. The hypothesis of this paper can be summarised
as follows:

- From a phonetic perspective, s-retraction in /rs/ is a perseverative assimilation
  of the [apical] tongue shape and [retracted alveolar] place of articulation
  of [r/ɾ]⁵ to a following [(dento-)alveolar] and [laminal/apical]⁶ [s], which
  yields an [apical] [retracted alveolar] [s̠].

- From a phonological perspective, s-retraction in /rs/ is the allocation of this
  [s̠] in the phonetic output [rs̠] to a more posterior phoneme category than its
  original category. To be phonological, s-retraction necessarily implies a change
  of phoneme category, e.g. from /rs/ to /rs̠/, /rʃ/ or /s̠/. In two-sibilant inventories
  with /s/ and /ʃ/, it consists more precisely of the reanalysis of a perceptually
  ambiguous phonetic output [rs̠] as an underlying /rʃ/ instead of /rs/.

- From a typological perspective, s-retraction in /rs/ is a sound shift which
  occurs, phonetically, in languages with a retracted alveolar [r] and a (dento-)
  alveolar [s] in /rs/-clusters, and phonologically, in languages which have a
  phoneme /s/ [s] and at least one other, more posterior sibilant phoneme category.


5 Since the difference between the alveolar trill [r] and the alveolar tap/flap [ɾ] plays no distinctive
role in this paper, [r] will hereafter refer to both [r] and [ɾ] for reasons of economy. The
approximants [ɹ/ɻ], hereafter likewise written [ɹ], could theoretically also trigger s-retraction in /rs/, but
this question could not be addressed here due to a lack of data. The retroflex approximant [ɻ] can
trigger retroflex assimilation (see section 4.1).

6 The initial tongue shape of [s] is not relevant here, since it becomes [apical] by means of the
assimilation in any case. Dart (1991: 21, 26) shows that the pronunciation of (dento-)alveolar [s] in
English and French varies freely between laminal and apical.
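
The typological part of the hypothesis can be restated as two simple checks on an inventory. The sketch below is only an illustration under simplifying assumptions: the class `Inventory`, the place labels and the three example encodings are hypothetical simplifications of the varieties discussed in this paper, not measured data.

```python
from dataclasses import dataclass
from typing import List

# Coronal places ordered from front to back (labels follow the text).
CORONAL_PLACES = ["dento-alveolar", "retracted alveolar", "postalveolar"]


@dataclass
class Inventory:
    name: str
    rhotic_place: str           # e.g. "retracted alveolar" or "uvular"
    sibilant_places: List[str]  # one place label per sibilant phoneme


def phonetic_retraction_expected(inv: Inventory) -> bool:
    """Phonetic condition: a retracted alveolar [r] before a (dento-)alveolar [s]."""
    return (inv.rhotic_place == "retracted alveolar"
            and "dento-alveolar" in inv.sibilant_places)


def phonological_retraction_possible(inv: Inventory) -> bool:
    """Phonological condition: /s/ [s] plus at least one more posterior sibilant."""
    if "dento-alveolar" not in inv.sibilant_places:
        return False
    front = CORONAL_PLACES.index("dento-alveolar")
    return any(CORONAL_PLACES.index(p) > front for p in inv.sibilant_places)


# Simplified, illustrative encodings (assumptions, not measurements).
mhg = Inventory("Middle High German", "retracted alveolar",
                ["dento-alveolar", "postalveolar"])
std_german = Inventory("modern standard German", "uvular",
                       ["dento-alveolar", "postalveolar"])
nl_dutch = Inventory("some Netherlandic Dutch varieties", "retracted alveolar",
                     ["retracted alveolar"])

if __name__ == "__main__":
    for inv in (mhg, std_german, nl_dutch):
        print(inv.name,
              phonetic_retraction_expected(inv),
              phonological_retraction_possible(inv))
```

Keeping the two checks separate reflects the claim that the phonetic trigger and the phonological reanalysis are distinct prerequisites.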


3 Phonetic and phonological motivations 
for s-retraction in /rs/ 


3.1 Phonetic articulatory theory 


In this section, I explain why I consider the prototypical place of articulation of 
alveolar /r/ and /s/ in languages with only one alveolar /r/ or /s/ respectively to 
be [retracted alveolar] (as mentioned in footnote 4), by referring to literature about
articulatory economy and auditory dispersion. I then detail the articulatory 
motivations for s-retraction in /rs/. 

The consonants [r] and [s] are standardly described in the literature as alveolar
sounds, along with e.g. /t, d, n, l/, seldom with very detailed descriptions of their
precise place of articulation (as e.g. dento-alveolar, alveolar, retracted alveolar or
alveo-palatal). For example, [r] is described as a sound produced when “the tongue 
blade and tip move up to the dental-alveolar-prepalatal region” (Barry 1997: 36, 
my emphasis). The wide range of described places of articulation is due to the 
wide range of different productions of these sounds: according to Ladefoged/ 
Maddieson (1996: 221), realisations of [r] “vary across speakers and languages in 
the location of the contact on the upper surface”. Boyce (2015: 261) shows for the 
English /r/ that “although many textbooks refer to /r/ as having an alveolar place 
of articulation, it is more accurate to say that it has a relatively undefined ‘palatal’ 
or ‘post-alveolar’ primary place of articulation”. This variation is explained by 
the phenomenon of permissible variation: “a category is allowed more auditory 
variation if it is alone on its auditory continuum than if it has neighbours from 
which it has to stay distinct” (Boersma/Hamann 2008: 222). This holds of course 
not only for perception, but also for production. For example, a speaker of a 
language with one single sibilant will be free to realise it as e.g. alveolar, retracted 
alveolar or palato-alveolar, but a speaker of a language with two sibilants will 
need to keep them apart (e.g. alveolar vs. palato-alveolar) in order to avoid
confusion. This is described as auditory dispersion in Boersma/Hamann (2008).
However, at the same time, articulatory economy tends to favour realisations
which require less tongue movement away from the average or rest position (Boersma/
Hamann 2008: 220f., 230; Shariatmadari 2006), such as retracted alveolar sibilants.
Adams (1975: 290), Vijūnas (2010: 42) and Martinet (1955: 236f.) confirm
typologically that languages with only one sibilant tend to have it realised as a
retracted alveolar, with average (acoustic) centre of gravity values⁷ on a scale
from front (dental [s̪], highest COG) to back (postalveolar retroflex subapical [ʂ],
lowest COG). Articulatory economy thus favours retracted alveolar rhotics and
sibilants from an absolute point of view (i.e. considering the sound itself,
independently of its phonetic or phonological environment). Languages with two
sibilants, on the contrary, tend to have a dispersed contrast with [s] and [ʃ],
pronounced far enough from each other (as (dento-)alveolar and palato-alveolar,
respectively), with less permissible variation and strictly distinct centres
of gravity. This also holds for coronal rhotics: Russian, for example, has a contrast
between a palatalised dental trill [rʲ] and a postalveolar trill [r̠] (Ladefoged/
Maddieson 1996: 221). This means that the place of articulation of coronal
rhotics and sibilants will tend towards [retracted alveolar] when they are the sole
phoneme of their class, and towards the extremities of their articulatory-auditory
spectrum ([(dento-)alveolar] and [postalveolar]) when they contrast in place of
articulation with another coronal rhotic or sibilant respectively.⁸

Concerning the apical-laminal distinction, Ladefoged/Maddieson (ibid.: 218) 
explain that trill and tap/flap [r]’s are predominantly apical: 


Trills are much more easily produced if the vibrating articulator has relatively small mass, 
hence the most common trills involve the tongue tip vibrating against a contact point in the 
dental/alveolar region [...]. In fact by far the most common type of trill is one involving the 
tongue tip. 


7 The centre of gravity (COG) of a sound is obtained by “weighing the frequencies in the spectrum 
by their power densities” (Boersma/Hamann 2008: 229). 
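
The definition of the centre of gravity can be made concrete with a short sketch. It assumes that a power spectrum is already available as two parallel lists (frequencies in Hz and the power at each frequency); the function name `centre_of_gravity` and the toy spectra are invented for illustration and are not values from any of the studies cited here.

```python
from typing import Sequence


def centre_of_gravity(freqs_hz: Sequence[float], power: Sequence[float]) -> float:
    """Power-weighted mean frequency of a spectrum, in Hz."""
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total_power = sum(power)
    if total_power <= 0:
        raise ValueError("spectrum has no energy")
    # Each frequency counts in proportion to the power found at that frequency.
    return sum(f * p for f, p in zip(freqs_hz, power)) / total_power


if __name__ == "__main__":
    # Toy spectra: equal energy at 1 kHz and 3 kHz, then more energy at 1 kHz.
    print(centre_of_gravity([1000.0, 3000.0], [1.0, 1.0]))
    print(centre_of_gravity([1000.0, 3000.0], [3.0, 1.0]))
```

A sibilant whose energy is concentrated at higher frequencies therefore receives a higher COG, which is why the front-to-back scale in the main text runs from high to low values.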

8 When it comes to the statistical distribution of rhotics and sibilants in the languages of the 
world, a notable difference is found: amongst the languages which have at least one rhotic, it is 
most common (75.3%) for a language to have exactly one rhotic, whilst only 18.9% of all languages 
have more than one (Wiese 2011: 713, based on UPSID). In particular, my own calculations in the
UPSID show that, of languages with coronal trills or taps/flaps, 89.8% have just one. In contrast,
only 52.6% of the languages with sibilants have exactly one, with 41.7% of all languages having 
more than one (UPSID). One is thus overall very likely to find languages with a retracted alveolar 
rhotic, and relatively likely to find languages with a retracted alveolar rhotic and a (dento-)alve- 
olar sibilant. 




The only exception to this tendency known to Ladefoged/Maddieson (1996: 228) is
the Czech fricative laminal alveolar <ř> /r̝/, which however contrasts with a trilled
apical alveolar /r/. In a language with one [r], it is thus predicted to be [apical] and
[retracted alveolar]; in a language with two coronal rhotics, at least one of the
two will be [apical], either [retracted alveolar] as in Czech or [postalveolar] as in
Russian.

Regarding sibilants, concrete phonetic observations made by Dart (1991: 22, 
29, 32) show that there is a correlation between the configuration of the tongue 
and the backness of /t, d, s, z, n, l/ in French and English: “the more apical the
articulation, the farther back on the palate it is articulated” (Dart ibid.: 22). A 
striking exception to this tendency is however found in English apical /s/ and 
/z/, which are consequently articulated further forward. Dart (ibid.: 29) explains 
it as follows: 


The explanation for this difference may lie in the fact, as stated by Catford (1977: 157), that 
the acoustic and aerodynamic differences between apical and laminal /s/ are more evident 
if they are alveolar than if they are dental. A retracted apical fricative opens up a large 
sublingual resonance cavity, which is characteristic of [ʃ] production and would presumably
cause the /s/ to trespass on the acoustic space of this contrasting segment, an effect
naturally to be avoided. 


Dart (1991) and Catford (1977) thus make the crucial observation that /s/, which
is (dento-)alveolar in well-dispersed inventories with two sibilants such as English,
is prone to auditory misinterpretation as /ʃ/ when it is apical and (retracted)
alveolar. This explains phonetically the tendency of [s̠] to be reanalysed as /ʃ/
(see next section). Stausland Johnsen (2012: 512), citing Anderson (1997), confirms
independently that “there is an observed tendency for listeners [...] to
asymmetrically misperceive apical alveolars as apical postalveolars”. Fronting
the English apical /s/ to a dental place of articulation can be considered a strategy
to prevent such a misperception.

As said above, articulatory economy favours retracted alveolar rhotics
and sibilants from an absolute point of view. From a relative point of view (i.e.
in a given environment, here “after a retracted alveolar [r]”), articulatory
economy also favours [s] in [rs] becoming [s̠], since this requires fewer movements
of the tongue: instead of needing to move from [retracted alveolar] to
[(dento-)alveolar] and (optionally) from [apical] to [laminal], the tongue keeps
the same position and shape throughout the entire articulation of the cluster
(i.e. [retracted alveolar], [apical]). Pragmatic evidence for this articulatory
ease is the observation that [r] and [s̠] can be pronounced simultaneously
(sounding somewhat like the Czech fricative trill <ř> [r̝]), whilst [r] and [s]
cannot.


3.2 Phonological perceptual theory


How [s̠], the output of phonetic s-retraction in /rs/, will be interpreted by the
phonological system of a language depends on the sibilant inventory itself: how
many sibilant phonemes (contrasting in place of articulation) it has, and to
which mean phonetic realisation they correspond. A first prerequisite for phonological
s-retraction in /rs/ is that there must be at least two phonetically distinct
realisations, namely a (dento-)alveolar [s] and a retracted alveolar [s̠] after /r/. In
one-sibilant inventories with a retracted alveolar [s̠] as prototypical realisation of
/s/, s-retraction cannot occur because /s/ after /r/ is acoustically indistinguishable
from any other /s/, and no phonemic split or reanalysis as another phoneme can
thus take place. A second prerequisite for phonological s-retraction in /rs/ is that
/s/ after /r/ must be reassigned to a more posterior phoneme category, either by
provoking a phonemic split or undergoing a reanalysis. In the first case, [s̠] becomes
a phoneme of its own and contrasts with /s/, something which implies the loss of
the conditioning context (i.e. r-loss).⁹ If however [s̠] fails to establish itself as a new
phoneme, it will be reallocated to the phoneme category whose mean phonetic
realisation is perceptually closest to [s̠]. In a three-sibilant inventory, this will be the
middle phoneme category (e.g. /s̠/ [s̠] for Basque). In a two-sibilant inventory with
a well-dispersed /s/-/ʃ/ contrast, the phonetic output [s̠] is prone to reanalysis as
/ʃ/, as explained above by Dart (1991: 29) and confirmed typologically by Adams
(1975) and Kwakkel (2008: 4). This is precisely what happened in Middle High German.
The phonological motivation for s-retraction in /rs/ is thus the fact that [s̠] is
perceptually biased towards being perceived as a more posterior phoneme category
than its original category, and is therefore reallocated to that category.
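
The reallocation step described above can be sketched as a nearest-category choice. This is a deliberately crude illustration, not the perceptual model of this paper: it assumes that each sibilant phoneme can be represented by a single mean centre-of-gravity value, that perceptual distance is the plain difference in Hz, and all numbers below are invented for the example. The function name `reassign` is likewise hypothetical.

```python
from typing import Dict


def reassign(token_cog_hz: float, category_means_hz: Dict[str, float]) -> str:
    """Return the phoneme category whose mean COG is closest to the token."""
    if not category_means_hz:
        raise ValueError("the inventory must contain at least one category")
    return min(category_means_hz,
               key=lambda cat: abs(category_means_hz[cat] - token_cog_hz))


# Invented mean values (Hz), for illustration only.
one_sibilant = {"/s/": 5200.0}
two_sibilants = {"/s/": 7000.0, "/ʃ/": 4000.0}
three_sibilants = {"/s/": 7000.0, "/s̠/": 5200.0, "/ʃ/": 4000.0}

if __name__ == "__main__":
    retracted_token = 5000.0  # a hypothetical [s̠] produced after [r]
    for inventory in (one_sibilant, two_sibilants, three_sibilants):
        print(sorted(inventory), "->", reassign(retracted_token, inventory))
```

Under these assumed values the same token stays /s/ in the one-sibilant inventory, is heard as /ʃ/ in the two-sibilant inventory and falls into the middle category in the three-sibilant inventory, which corresponds to the three scenarios distinguished in the text.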


9 No clear-cut example of this is known to me for the s-retraction described here, but it occurred
as a result of the Swedish retroflex assimilation (see section 5), where /ʂ/ did not merge with
what is now /ɕ/ and /ɧ/; and it could be the case for Afrikaans (Ewald 2015: 38), where /r/ before
[s̠] “is almost inaudible” and [s̠] thus almost phonemic.


4 Typological comparison of West Germanic
sibilant and rhotic inventories


This section compares the phonetic realisations of the phonemes /r/, /s/ and /ʃ/
in West Germanic varieties, from a diachronic and synchronic perspective. Whilst
these languages all have /r/ as a phoneme, they mainly differ in having either a
sole /s/ (in Dutch, Frisian and Low German) or in having a contrast between
(dento-)alveolar /s/ and postalveolar /ʃ/ (in English and German). The West Germanic
varieties described here have in total three different sibilants and three
main realisations of /r/, as depicted in table 1 (NB: regardless of voicing).


Table 1: Attested combinations of rhotic consonants and sibilants in phonetic inventories of
West Germanic languages


                                  Retracted alveolar [s̠]     (Dento-)Alveolar [s]       Postalveolar [ʃ]

Alveolar trill [r] or             Some Netherlandic           Some Flemish varieties
tap/flap [ɾ]                      Dutch varieties
                                                              Scottish English, Middle High German,
                                                              some southern German varieties

Uvular trill [R] or               Some Netherlandic           Some Flemish varieties
fricative [ʁ]                     Dutch varieties
                                                              Dutch Limburgish, modern standard German

Alveolar or retroflex             Some Netherlandic           Standard English
approximant [ɹ] or [ɻ]            Dutch varieties


In the original Proto-Germanic sibilant inventory, there was most probably one
/s/, for all modern occurrences of /ʃ/ not found in loanwords result from consonant
mutations and correspond to original Germanic clusters (e.g. /sk/, /sj/).
Considering the observation made in 3.1 that one-sibilant inventories tend to
have a retracted sibilant, this sole /s/ must have been realised as [s̠], exactly like the
modern pronunciation in the more conservative Germanic daughter languages
(e.g. Icelandic, see Vijūnas 2010: 45).

The original phonetic realisation of /r/ is rather controversial in the literature, 
but considering the occurrence of [r] in Germanic languages which have been less 
exposed to direct or indirect French-speaking influence, it most probably corre- 
sponds to the original PG */r/ (see Trautmann (1880) and Chambers/Trudgill 
(1998), who consider the uvular /R/ a French innovation). 

The following sections describe, for three groups of West Germanic varieties
(English, High German and Dutch/Frisian/Middle Low German), first the
development of /s/ and /ʃ/, then the development of /r/, and subsequently why
s-retraction in /rs/ did or did not happen in that group of languages. This shall
allow us to test the hypothesis stated above in detail on the West Germanic
group, before extending the typological scope to North Germanic and
non-Germanic languages in section 5.


4.1 Middle English and modern English


English is most likely the earliest Germanic language to have developed the sibilant
contrast /s/ - /ʃ/, by means of a palatal assimilation in /sk/-clusters which
created /ʃ/. Although the time of appearance of /ʃ/ is (not undisputedly) estimated
to be around the late Old English period (Cercignani 1983: 323), it must certainly
have existed as a phoneme in the eleventh century, when a retracted [s̠] in Norman
or Middle French loanwords was reanalysed as /ʃ/ (Adams 1975: 283f.). Examples
of such loanwords are push (Fr. pousser), leash (Fr. laisse) and punishment (Middle
French punissement).

In most English varieties, /r/ mutated to the alveolar approximant [ɹ].¹⁰ In
some varieties, amongst which American and Irish English, it is pronounced as a
retroflex approximant [ɻ]. In varieties of Scottish and South African English, it is
pronounced [r]. Northern Northumbria had a uvular /R/, which has become very 
rare if not almost extinct (Ogden 2009: 90-93), but has been in use “for at least 
the last 300 years” (Maguire 2017: 88). 

In Middle English, be it in varieties which already had an approximant or in 
varieties which retained the trill or tap/flap, I found no traces of s-retraction in /rs/. 

As Ball/Rahilly (1999: 56) point out concerning the retroflex approximant [ɻ],
“some speakers use a retroflex approximant as their realization of ‘r’, and if they
pronounce final ‘r’ in words like ‘reader’ and then add a plural ending to this, the
final sound may well be [ʐ]”. Retroflex assimilation is thus observed in rhotic
varieties of English with a retroflex approximant, amongst which Scots and Scottish
English. In those varieties, it is ascribed to Gaelic influence by Maguire (2012: 63),
who mentions the occurrence of “retroflex approximant pronunciations of /r/,
with retraction of following alveolars (as in horse [haɻʂ])”. Retroflex assimilation
can be found in a range of Northern English dialects as wide as from Yorkshire
(Hedevind 1967: 73) and Cumbria (Cathcart 2012: 80f.) to Orkney (Schmitt
2015: 67f.) and the Hebrides (Cathcart 2012: 81), and occurs with all “following
alveolar consonants” (Maguire 2012: 63).

10 Little is known about the precise period of the shift to an approximant, which has occurred at
a different pace according to phonotactic context (e.g. [r] still being preserved in some clusters
such as /thr/) and geographically (e.g. not having reached parts of Scotland).

In his short description of retroflex consonants in English, Orton (1939)
notes that Northumbrian dialects have retroflex /ʈ, ɖ, ʂ, ʐ, ɳ, ɭ/ for the clusters
/rt, rd, rs, rz, rn, rl/, and that the uvular /R/ is no longer pronounced in this context
but seems to have “left its mark upon adjacent sounds” in the speech of
“many people from all parts of the British Isles, especially Scotland” (Orton
1939: 40). This allows the interpretation that Northumbrian dialects could have
had productive s-retraction in /rs/ with a uvular /R/. However, considering the
occurrence of retroflex and non-retroflex s-retraction in /rs/ (see next paragraph)
in the neighbouring dialects with [ɻ] and [r] respectively, the Northumbrian
retroflexes might have been relics of an earlier retroflex assimilation with
a coronal rhotic, or a consequence of language contact. Wells (1982: 370) cites
Orton (1939) but gives as example of the backing of central vowels before a (now
disappeared) Northumbrian uvular /R/ the word first [fɔ:st] (Wells 1982: 374f.),
not [fɔ:ʂt], which even speaks against any s-retraction in /rs/ in Northumbrian
dialects.

Some Scottish varieties of English with trilled [r] also have s-retraction in /rs/ 
as an active rule, inside words as well as across morpheme or word boundaries. 
For example, s-retraction after [r] can be heard distinctly in a dialect interview 
from the Isle of Skye, yielding e.g. of courshe or speakersh (BBC Voices, at 0:52 and 
1:09). The speaker produces non-retracted non-sibilants after /r/ (e.g. words 
[w3ts] at 0:58), unlike other speakers with retroflex assimilation (e.g. start [sta:t] 
at 0:03), which underlines the difference between both phenomena despite their 
occurrence in the same region. 

S-retraction with an approximant /ɹ/ is also found in the clusters /str/, /skr/, 
/spr/ and /sr/ in English varieties of the United States, New Zealand and the 
United Kingdom (Baker/Archangeli/Mielke 2011; Stevens/Harrington 2016). Baker/ 
Archangeli/Mielke (2011: 359) note however that “/stɹ/ strongly favors a bunched /ɹ/ 
over a retroflex /ɹ/ and all of the subjects in [their] study produced a bunched /ɹ/ 
variant [...] in /stɹ/ clusters”. It thus seems that s-retraction in /str/ is triggered by 
a (post-)alveolar affricated [ti/ti-] rather than by retroflexion, which makes it 
more similar to “non-retroflex” s-retraction. Rutter (2011: 31) mentions as a possible 
phonetic trigger the fact that “both /ɹ/ and /ʃ/ share a tongue position that is further 
back than /s/”, thus evoking an assimilation of place of articulation in /str/. 


4.2 Middle High German and modern German 


Starting from a pre-Old High German inventory with a sole /s/, the second Ger- 
manic consonant shift appeared around the sixth century in Upper German and 
spread gradually to Central German dialects (Stedje 2007: 75), transforming 
amongst others the PG */t/ into the affricate [ts]. It was at first realised as an 
affricate in all positions (i.e. also after vowels, contrary to modern German, see 
Lange 2007: 17), which means that Old High and Early Middle High German /s/ 
continued to be the sole member of its sibilant inventory, since the original PG 
*/sk/ had not been assimilated to [ʃ] yet (similarly to the English palatalisation 
of /sk/). Old High and Early Middle High German thus had a sole /s/, which must 
have been a retracted alveolar [s̠].¹¹ 

The phonetic realisation of /r/ was probably still identical to the PG trill *[r] 
in almost the entire German-speaking area up to the 18th century (see the map in 
Wiese 2011: 718), except for a small area in Southwest Austria which is claimed to 
have developed uvular /R/ autonomously in the Middle Ages (Bisiada 2009: 89-91). 

Phonological s-retraction did not occur in Early Middle High German as long as 
/ʃ/ [ʃ] did not exist as a phoneme. Judging from MHG orthography, s-retraction 
probably did not occur phonetically or phonologically, since [s] after /r/ was written <z> 
and not <s> (compare hirz in Engelien 1892: 46). In Middle High German, around 
the eleventh century (Hall 2008: 217), /sk/ became phonemicised as /ʃ/ in all posi- 
tions, creating a contrast with the original /s/ ([s̠]). Before the twelfth century as 
well, [ts] had become deaffricated to [s] after vowels, which made all three sibilants 
contrast with each other, [s] being rendered in the orthography as <z>, [s̠] as <s>, 
and [ʃ] as <sch> (Hall 2008: 218). The one-sibilant inventory of Early Middle High 
German thus became a three-sibilant inventory, with a contrast between [s], [s̠] 
and [ʃ] (Adams 1975: 284; Benware 1996: 266f.). In the twelfth century, [s̠] merged 
with [s] to [s] (Hall 2008: 218), leading to a two-sibilant inventory consisting of /s/ 
[s] and /ʃ/ [ʃ], as in modern German. The conditions necessary for phonological 
s-retraction to /rʃ/ were thus met, and s-retraction in /rs/ occurred (ibid.: 231f.). 

This provides an answer to the actuation problem: s-retraction in /rs/ did not 
occur phonologically in Low German varieties, because they still had one single 
/s/ pronounced [s̠].¹² Dialects located north of the boundary of the /t/ > /(t)s/ 


11 Additional evidence supporting this view comes from Transylvanian place names, which demonstrate 
that the first German settlers who migrated from the Moselle and Rhine region to Transylvania in 
the 12th century still had a retracted [s̠] for /s/. Rosenau has been reanalysed in Romanian (which 
already had a contrast between [s] and [ʃ]) as Râșnov ([rɨʃnov]), and in Hungarian (which also has this 
contrast) both as (Barca)rozsnyó ([roʒɲoː]) and Rosznyó ([rosɲoː]) (Siebenbuerger.de: Rosenau). It 
must thus have been pronounced [s̠] or [z̠] in Early Middle Central German to have been interpret- 
ed as [ʃ], [ʒ] and [s] in Romanian and Hungarian, something which is cross-linguistically attested 
for the transfer of [s̠] and [z̠] into languages with a [s]-[ʃ] contrast (Adams 1975). The same reanal- 
ysis is observed in other Transylvanian place names, e.g. Klausenburg (Rom. Cluj, Hun. Kolozsvár) 
(Siebenbuerger.de: Klausenburg). An [s] from PG */t/ consistently corresponds to Romanian [s], 
e.g. Weißkirch (Rom. Viscri) (Siebenbuerger.de: Deutsch-Weisskirch). The merger of <s> [s̠] and <z> [s] 
as well as s-retraction in /rs/ occurred later as witnessed by place names, e.g. Donnerschmuert 
(Ger. Donnersmarkt), Hamerschderf (Hammersdorf) and Mäterschdref (Mettersdorf) (Sieben- 
buerger.de). 

12 They had not assimilated /sk/-clusters to /ʃ/ (compare e.g. [sk] in modern Afrikaans, meaning 
that dialects from South Holland in the 17th century still had unassimilated /sk/) and they had 
no [ts] to deaffricate at all (not having taken part in the High German consonant shift). 

shift and south of the boundary of the /sk/ > /ʃ/ shift (i.e. dialects which have dat 
and Schule, e.g. in East Limburg) developed a contrast between the original 
/s/ > [s] and the assimilated /sk/ > [ʃ]. In these dialects, s-retraction in /rs/ could 
also happen, and did happen (see next paragraph). 

Evidence demonstrating that only [r] and not uvular /R/ triggers s-retraction 
is given by different outputs in different diachronic lexical layers in the dialect of 
Eys (Dutch Limburg). The modern dialect uses uvular /R/ in all positions, and 
exactly as in standard German, s-retraction in /rs/ is no longer productive. Yet, in 
words which are old and common enough to belong to the vocabulary of Limbur- 
gish farmers of the 17th century, a historical and standard Dutch or German /rs/ 
corresponds to /(ə)ʃ/ (where /r/ vocalised to /ə/ in stressed syllables and disap- 
peared in unstressed ones). Newer or “higher educated” words exhibit no trace of 
s-retraction, but contain [Rs] instead. 


(4) a. Eys dialect vee[ʃ]¹³    Ge. Ferse        En. heel 
    b. Ey. po[ʃ]elei           Nl. porselein    En. porcelain 
    c. Ey. ange[ʒ]óm           Nl. andersom     En. the other way around 
    d. Ey. get ange[ʃ]         Nl. iets anders  En. something else 
    e. Ey. get bete[Rs]        Nl. iets beters  En. something better 
    f. Ey. Noo[Rs]             Nl. Noors        En. Norwegian 
    g. Ey. pee[ʃ]              Nl. pers         En. (fruit) press 
    h. Ey. pe[Rs]              Nl. pers         En. (written) press 


The arrival of uvular /R/ in the language can be dated to after the borrowing of 
posjelei “porcelain” from French or Dutch, but before the borrowing of pers 
“written press” from standard Dutch. Uvular /R/ prevented s-retraction in /rs/ from 
continuing to operate in the Eys dialect, as can be observed in newer words and 
morphosyntactic constructions. In e.g. the Swiss dialect of Jaun, however (Stucki 
1917; Hall 2008: 232), /r/ is realised as [r] and therefore, s-retraction is still active 
across morpheme and word boundaries. 


4.3 Middle Low German and modern Dutch and Frisian 


As detailed in the previous section, Middle Low German had a sole /s/ in its 
sibilant inventory, which was a retracted one (Adams 1975: 289). This retracted 


13 Data provided by a native speaker of the Eys dialect. 

[s̠] has also historically been the variant used in the Low Countries until recent 
times, and it still is in Frisian and some Netherlandic varieties (compare the 
assumption made in Boersma/Hamann (2008: 230) that Dutch /s/ is [s̠]). These 
varieties with retracted [s̠] are found mostly in Zeeland, Holland and Fries- 
land (i.e. in the inner Low Countries, far from the border with French and Central 
German), whilst (dento-)alveolar [s] is found closer to the border with French 
and Central German, mainly in Flanders and Limburg.¹⁴ In all varieties, the 
phoneme /ʃ/ is present in loanwords (e.g. China). Some West Flemish and Lim- 
burgish dialects have [ʃ] in /sk/-clusters (Goossens/Taeldeman/Verleyen (eds.) 
2000: 19), but the other varieties have unassimilated /sk/-clusters ([sk] or [sx]). 
Some varieties such as West Flemish, Netherlandic Dutch and Frisian assimi- 
late clusters of /sj/ to [ɕ] (e.g. meisje “girl”). This does not prevent the Netherlan- 
dic Dutch and Frisian varieties from keeping /s/ pronounced as [s̠], coexisting with [ɕ] 
in a poorly dispersed sibilant inventory (Boersma/Hamann 2008: 230, 254). To 
sum up, Netherlandic Dutch and Frisian have sibilant inventories consisting of 
one retracted alveolar sibilant coexisting with [ɕ] in /sj/ and [ʃ] in loanwords; 
some West Flemish and Limburgish varieties have both [s] and [ʃ]; and other 
Flemish varieties have [s] and [ʃ] in loanwords only. 

The phoneme /r/ in the Low Countries exhibits an even higher diversity of 
realisations, since “almost all variants of /r/ found in the languages of the world 
[...] are observed in the Dutch language area” (Verstraeten/Van de Velde 2001: 45). 
The original alveolar trill or tap/flap [r] is rivalled by the more recent approximant 
[ɹ] and uvular /R/. A glance at the phonological atlas of Dutch dialects (Goos- 
sens/Taeldeman/Verleyen (eds.) 2000: 357) shows that uvular /R/ is well-estab- 
lished along the Rhine and Meuse region, in Limburg and Brabant, as well as in 
cities (e.g. Ghent), spreading from one centre to another (Verstraeten/Van de 
Velde 2001: 46). The approximant is gaining ground in the Netherlands, and has 
become known as the Gooise R. 

As explained in 2.2, in the varieties with a retracted [s̠], s-retraction in the 
phonological sense cannot occur because all occurrences of [s̠] are interpreted 
as /s/ regardless. Nevertheless, in Dutch varieties with [r] and [s], but not in 
those with uvular /R/ and [s], phonetic s-retraction in /rs/ occurs produc- 


14 An acoustic analysis of /s/ in sound files of the Nederlandse Dialectenbank (van Oosten- 
dorp 2014) indeed shows that /s/ is more retracted in Dutch regions centred around Holland and 
the north (tested in Ouddorp, Zwanenburg and Hallum, with a mean centre of gravity of 3,784 
Hz), and in the speech of older people in the Low Countries. A fronted [s] is found in regions 
closer to French and German (tested in Tongeren, Elingen and Clairmarais, mean COG: 4,628 Hz), 
and in the speech of younger people. 
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tively, as described in the next section. Due to a lack of data, nothing can be 
said here about Netherlandic Dutch varieties with an approximant /r/ and [s]. 
Nevertheless, Plug/Ogden (2003) show that in those varieties, /t/ and /d/ are 
pronounced as apical alveolars after an approximant /r/, whilst they are gen- 
erally laminal in other contexts (Scobbie/Sebregts 2010). This suggests that 
the approximant could exert a similar influence on a following /s/, making it 
apical and thus prone to reanalysis as a posterior phoneme category (see 
section 3.1). 


4.3.1 A phonetic study of Flemish s-retraction in /rs/ 


To quantify the observed Flemish s-retraction in /rs/, Kokkelmans (2017) proposes 
a pilot study which compares /s/ after /r/ to /s/ in other contexts and to /ʃ/ in 
loanwords (including their voiced counterparts), in regional varieties of standard 
Dutch spoken by Flemish students. Six informants, with a homogeneous back- 
ground in order to make results comparable and unperturbed by the potential 
influence of additional factors (native Flemish men between 18 and 30 years old, 
with a higher education background), were recorded reading the same text, and 
subsequently holding a colloquial conversation with the interviewer. Three of 
them are from East Flanders, one is from West Flanders, one from Brabant and 
one from the province of Antwerp. Four of them use an alveolar trill /r/ in usual 
speech, two use a uvular /R/, and all pronounce /s/ as a (dento-)alveolar [s]. They 
are anonymously labelled Alveolar1, Alveolar2, Uvular1, Uvular2 etc. according to 
their own place of articulation for /r/, i.e. either [r] or /R/. All speak regional 
varieties of standard Dutch, with a moderate extent of regional or local features. 
No participant knew what the experiment was about, so as to guarantee 
spontaneous speech. 

Calculations were made using Praat (Boersma/Weenink 2005) on the acoustic 
characteristics of sibilants contained in a total number of 64 different words: 


(5) - 37 words containing the cluster /rs/ 
    - 13 containing its voiced equivalent /rz/ 
    - 3 containing /s/ in another context (e.g. steden “cities”) 
    - 3 containing /z/ in another context (e.g. zouden “should”) 
    - 3 containing /ʃ/ (e.g. pêche “peach”, loanword from French) 
    - 3 containing /ʒ/ (e.g. journal in English, with the exclusion of the preced- 
      ing /d/) 
    - 2 Dutch words containing /sj/ (nationale and sociale), to test if speakers 
      realise it as [ɕ] or [sj] 

These 64 words were all to be read in the same (con)text by the 6 informants, so 
that no different phonetic context could interfere. All 374 sibilant realisations¹⁵ 
were manually extracted in Praat, and an automated Praat script calculated the 
centre of gravity (COG) of the extract to determine “how sh-like” the sibilant is. As Ladefoged 
(2001: 56) explains, [s] is characterised by “a large amount of energy in the upper 
part”, “comparatively little energy below 3,500 Hz, and a noticeable intense 
band above 5,000 Hz”. In turn, [ʃ] “has more energy at a slightly lower frequency, 
centered at a little above 3,000 Hz”. Similarly, the difference between voiced [z] 
and [ʒ] consists of a lower concentration of energy for the latter. 
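The COG measure can be illustrated with a minimal sketch. It is not the Praat script used in the study: it assumes that the centre of gravity is the power-weighted mean frequency of the spectrum, and it uses hypothetical pure tones (via the illustrative helper `tone`) instead of recorded sibilants, so that a signal with its energy higher in the spectrum yields a higher COG.

```python
import cmath
import math

def power_spectrum(samples, sample_rate):
    """Naive DFT: return (frequencies in Hz, power) for bins 0..n/2."""
    n = len(samples)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        powers.append(abs(coeff) ** 2)
    return freqs, powers

def centre_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency of the spectrum, in Hz (assumed definition)."""
    freqs, powers = power_spectrum(samples, sample_rate)
    return sum(f * p for f, p in zip(freqs, powers)) / sum(powers)

def tone(freq_hz, sample_rate=16000, n=256):
    """Hypothetical test signal: a pure sine wave, not real speech."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

# Energy concentrated high in the spectrum (more [s]-like) vs. lower (more "sh-like").
high = centre_of_gravity(tone(5000), 16000)
low = centre_of_gravity(tone(3000), 16000)
print(round(high), round(low))
```

Real fricative noise spreads its energy over a wide band, so measured COG values depend on the analysis window and frequency range chosen; the sketch only shows the direction of the effect.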


[Figure 1, three panels for informant Alveolar4: zouden (COG: 3,969 Hz), mooier zouden (COG: 2,576 Hz), journal (COG: 2,923 Hz). Vertical axis: 0–100%; horizontal axis: 0–10,000 Hz.] 


Fig. 1: Graphs representing the mean intensity (vertical axis) as a function of the frequency 
(horizontal axis), for /z/ in “zouden” and “mooier zouden”, and /ʒ/ in “journal” (from left to 
right), produced by informant Alveolar4. The grey part corresponds to all frequencies below or 
equal to 3,500 Hz. 


As an example, the three graphs in figure 1 reveal the striking difference between /z/ 
in zouden “should” when following an alveolar /r/ in the previous word, and /z/ in 
zouden when not preceded by [r]. Despite the word boundaries, s-retraction oper- 
ates in this example to such an extent that /z/ after [r] is “even more [ʒ]” than the 
/ʒ/ in journal pronounced by the same speaker. 

The results for all 64 words of the study are illustrated in figure 2. As figure 2 
shows, informants with alveolar /r/ have lower COG values (i.e. more “sh-like” pro- 
nunciations) for /s/ and /z/ when preceded by /r/ than in other contexts, but they 
are generally not as low as those of /ʃ/ and /ʒ/. On a scale from the mean /ʃ/ and /ʒ/ 


15 Out of the 384 predicted occurrences (64 words × 6 informants), 10 occurrences were excluded 
because the informant mispronounced the word or used another pronunciation, e.g. reading the 
sign “%” as procent instead of percent. 

(3,185 and 3,055 Hz, respectively) to the mean /s/ and /z/ (4,933 and 3,742 Hz), the 
mean /s/ and /z/ after /r/ are located at 4,287 and 3,057 Hz respectively, i.e. 45.34% 
“s-ness” if the mean /ʃ/ and /ʒ/ correspond to 0% and the mean /s/ and /z/ to 
100%. 

The two informants with uvular /R/ do not exhibit active s-retraction in /rs/. On 
a scale from the mean /ʃ/ and /ʒ/ (2,988 and 2,473 Hz) to the mean /s/ and /z/ (4,113 
and 3,465 Hz), the mean /s/ and /z/ after /R/ are located at 4,781 and 2,991 Hz, i.e. 
109.16% if the mean /ʃ/ and /ʒ/ correspond to 0% and the mean /s/ and /z/ to 100%. 
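The two percentages can be recomputed from the mean COG values given in the text. The sketch below assumes that the voiceless and voiced means are pooled (averaged) before the scale is applied; this pooling is an inference from the reported figures rather than a procedure stated explicitly, but it reproduces both values. The function name `s_ness` is chosen here for illustration only.

```python
def mean(values):
    return sum(values) / len(values)

def s_ness(target_hz, posterior_hz, anterior_hz):
    """Place the pooled target mean on a scale where the pooled posterior
    sibilants define 0% and the pooled anterior sibilants define 100%."""
    low = mean(posterior_hz)
    high = mean(anterior_hz)
    return 100 * (mean(target_hz) - low) / (high - low)

# Mean COG values in Hz, taken from the text (voiceless, voiced).
alveolar = s_ness(target_hz=[4287, 3057],     # /s/, /z/ after /r/
                  posterior_hz=[3185, 3055],  # reference posterior sibilants
                  anterior_hz=[4933, 3742])   # /s/, /z/ in other contexts
uvular = s_ness(target_hz=[4781, 2991],       # /s/, /z/ after /R/
                posterior_hz=[2988, 2473],
                anterior_hz=[4113, 3465])

print(f"{alveolar:.2f} {uvular:.2f}")  # prints "45.34 109.16"
```

A value near 100% means that the sibilant after the rhotic is as [s]-like as in other contexts, whereas values well below 100% indicate retraction.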

An inquiry into individual mean values shows that s-retraction is the strongest 
in the speech of Alveolar1 (-22.55%), followed by Alveolar3 (46.85%), Alveolar4 
(62.52%) and Alveolar2 (65.64%). Uvular1 (86.68%) and Uvular2 (121.63%) do not 
seem to have s-retraction as an active rule. 


[Figure 2: two boxplot panels, “Alveolar” and “Uvular”. Horizontal axis: rs, rz, s, sch, sj, z, zh; vertical axis: COG in Hz.] 


Fig. 2: Boxplot representation of the mean COG values for speakers with alveolar or uvular /r/ in 
Kokkelmans (2017) according to phoneme and phonetic context, generated using the statistical 
software JASP (JASP Team 2018). <sch> stands for /ʃ/ and <zh> for /ʒ/. Dots represent outliers 
(i.e. exceptionally diverging values). 

A striking observation is that the informants who have the lowest mean COG 
values for /sj/ (indicating s-retraction to /ɕ/, compare 3.3) are alveolar speakers 
with active s-retraction in /rs/, namely Alveolar1 (07.36% for /sj/) and Alveolar3 
(56.04% for /sj/). By contrast, Uvular2 (80.75%), Alveolar4 (98.60%), Alveo- 
lar2 (99.68%) and Uvular1 (148.72%) do not productively retract /sj/ towards [ɕ]. 
This could mean that s-retraction in /rs/ favours s-retraction in /sj/ (and perhaps 
also the other way around). This pattern resembles the one observed in the MHG 
s-retraction after /r/ and before consonants, with s-retraction in a specific context 
(e.g. before /p/) triggering s-retraction in other contexts (e.g. before /n/), i.e. “drag- 
ging” other clusters towards /ʃ/ (described as “processual change” in Benware 
1996: 268, 275). One can thus conclude that these Flemish varieties with alveolar 
[r] exhibit the same phenomenon of productive s-retraction in /rs/-clusters as 
Middle High German. 


5 Final observations and conclusion: Answering 
the actuation problem 


The cross-linguistic part of this study has focused in detail on West Germanic lan- 
guages. However, other languages also show traces of once-productive s-retraction 
in /rs/, or have it as a productive rule. 

In the Germanic family, a large group of Faroese, Swedish and Norwegian 
varieties have active retroflex assimilation in /rs/-clusters to [ʂ], also across 
morpheme and word boundaries (Eliasson 2000: 40f.). As explained in sections 
2.2 and 4.1, this kind of s-retraction is different from the one described here, in 
the sense that it applies to all alveolar consonants preceded by /r/ (also /t, d, n, 
l/). Exactly as for “non-retroflex” s-retraction, retroflex assimilation and uvular 
/R/ are mutually exclusive in Scandinavia (Torp 2001), despite their co-occurrence 
in Frogner and Arendal, as detailed in section 2.2. Torp (2001: 82) remarks con- 
cerning this exception to the complementary distribution of /R/ and retroflexion 
that we “have to reckon with some special kind of dialect contact in order for the 
dorsal /r/ to penetrate into a dialect area with retroflexes”. 

10,000 kilometres further south, s-retraction in /rs/ also occurs productively 
in Afrikaans, according to recent phonetic analyses (Ewald 2015; Wissing/Pien- 
aar/Van Niekerk 2015) which found /s/ to be realised after the alveolar [r] “as a 
voiceless retracted alveolar sibilant [s̠]” (Ewald 2015: 35) instead of [s]. This must 
be an independent rather than inherited phonological development, since Afri- 
kaans is mainly based on dialects from southern Holland (Kloeke 1950), which 
still today have a retracted alveolar [s̠]. 

S-retraction in /rs/ also happened in Indo-European satem languages such 
as Balto-Slavic and Indo-Iranian, in contexts in which /s/ was preceded by /r/, 
/u/, /k/ or /i/ (known as the ruki-rule, see Pedersen 1895: 53f., 74). In this case, 
s-retraction in /rs/ only started to operate phonologically after the /s/-/ʃ/ con- 
trast appeared (from the assibilation of the PIE palatovelar series */ḱ, ǵ, ǵʰ/), 
triggered by an /r/ pronounced (most probably) as an alveolar trill or tap/flap 
(which is still the pronunciation of /r/ in almost all their daughter languages). 

Outside the Indo-European language family, Western Basque varieties are 
found to have a single <s> (apical [s̺]) corresponding to historical [rs]-clusters, 
whereas Eastern varieties have retained <rz> ([rs]), as e.g. in uso [u.s̺o] vs. urzo 
[ur.so], “pigeon, dove” (Trask 1996: 77). This example of s-retraction in /rs/ with r-drop 
perfectly corroborates the hypothesis of a retracted alveolar place of articulation 
and apical tongue shape assimilation, since the phonetic output [s̺] is preserved 
exactly as such in the phoneme /s̺/ which exists in the language, and not reana- 
lysed as the laminal /ʃ/ <x> which also exists in Basque’s three-sibilant inven- 
tory. In Western and Central Basque, /r/ surfaces as an alveolar trill or tap/flap, 
unlike in the French (i.e. Eastern) Basque country, where it has partly become a 
uvular /R/ (McColl Millar/Trask 2015: 285). 

The typologically recurring pattern observed here confirms the hypothesis 
of this paper: s-retraction in /rs/-clusters only arises phonetically in languages 
with an alveolar /r/ and a (dento-)alveolar /s/, and phonologically in invento- 
ries with an /s/ [s] and at least one other more posterior category. This provides 
an answer to the actuation problem: by knowing which phonetic properties and 
which phonological inventories allow s-retraction in /rs/ to start operating, one 
can explain why it does or does not occur productively in different languages. 
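The two conditions of the hypothesis can be written out as a pair of predicates. The sketch below is a deliberate simplification: the class `Variety`, its feature values and the coding of the example varieties are hypothetical labels introduced here for illustration, and the phonological condition is assumed to presuppose the phonetic one.

```python
from dataclasses import dataclass

@dataclass
class Variety:
    name: str
    r_place: str                  # "alveolar" or "uvular" (simplified coding)
    s_place: str                  # "dento-alveolar" or "retracted"
    has_posterior_sibilant: bool  # e.g. a contrasting posterior sibilant phoneme

def phonetic_s_retraction(v):
    """Phonetic condition: alveolar /r/ and (dento-)alveolar /s/."""
    return v.r_place == "alveolar" and v.s_place == "dento-alveolar"

def phonological_s_retraction(v):
    """Phonological condition: additionally a more posterior category
    (assumed here to build on the phonetic condition)."""
    return phonetic_s_retraction(v) and v.has_posterior_sibilant

# Illustrative codings based on the descriptions in sections 4.2 and 4.3.
varieties = [
    Variety("MHG after the sibilant merger", "alveolar", "dento-alveolar", True),
    Variety("Middle Low German", "alveolar", "retracted", False),
    Variety("Flemish, alveolar /r/", "alveolar", "dento-alveolar", True),
    Variety("Flemish, uvular /R/", "uvular", "dento-alveolar", True),
]

for v in varieties:
    print(v.name, phonetic_s_retraction(v), phonological_s_retraction(v))
```

Such a binary coding ignores gradience (e.g. the approximant realisations of /r/ discussed in 4.3), so it should be read as a summary of the claim, not as a predictive model.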

In the context of the general shift, in IE languages, from a one-sibilant inven- 
tory with [s] (Adams 1975: 290) to one with several contrasting sibilants, Low 
German, Dutch and Frisian have shown themselves to be the most conservative 
West Germanic varieties by keeping their sole retracted [s] up to recent times. 
English first led the innovation of phonemicising /ʃ/, and later experienced 
s-retraction in /rs/ in its varieties with [r] and [ɻ]. At the other end of the West 
Germanic family, High German phonemicised /ʃ/ from the same /sk/-cluster 
and also underwent s-retraction, which did not reach as far north as northern 
Germany and the Low Countries because of their conservative sibilant inventory. 
It has thus been shown that as soon as the required conditions are met, all Ger- 
manic languages start to undergo s-retraction in /rs/, something which sheds light 
on how related languages experience parallel developments in the same direction 
despite being separated from each other (compare e.g. Flemish and Afrikaans). On 
the other hand, I found no trace of s-retraction in /rs/ at all in Romance languages. 
This leads to the crucial observation that although both groups arose from a variety 
which could not and did not experience s-retraction in /rs/, namely Proto-Germanic 
and Latin, only the members of the former language group later developed in that 
direction regardless of geographical barriers. This raises the question to what 
extent the potentiality of s-retraction in /rs/ was already “contained in the gram- 
mar” of Proto-Germanic and not in that of Latin. This question opens possibilities 
for further investigation in the domain of the underlying phonological grammars, 
to address the questions which remain with respect to the actuation problem. 


Part 3: Psycholinguistic Perspectives 


Leah S. Bauke (Wuppertal) 

The role of verb-second word order for 
L1 German, Dutch and Norwegian L2 English 
learners: a grammar competition analysis 


Abstract: This paper investigates the role of verb-second (V2) word order in 
second language acquisition. It addresses the question whether a first language 
(L1) that is a V2 language affects acquisition of a second language (L2) that is not 
a V2 language. To test this, the paper investigates the interpretation of wh-+parti- 
cle constructions of L2 English speakers with L1 German, Dutch and Norwegian. 
The experiment shows a V2 effect in L2 English for these speakers, which, however, 
is not uniform in terms of amplitude across all speaker groups and thus strongly 
suggests that there is no generalized V2-parameter. 


Zusammenfassung: In diesem Beitrag wird die Rolle von V2 (Verbzweitstellung) 
im Zweitspracherwerb untersucht. Im Mittelpunkt steht dabei die Frage, ob eine 
Erstsprache (L1), die eine V2-Sprache ist, den Erwerb einer Zweitsprache (L2), 
die keine V2-Sprache ist, beeinflusst. Um dies zu testen, untersucht das Papier die 
Interpretation von wh-+Partikel-Konstruktionen von L2 Sprechern des Englischen, 
deren L1 Deutsch, Niederländisch oder Norwegisch ist, und zeigt, dass in der L2 
Englisch dieser Sprecher ein V2-Effekt nachgewiesen werden kann. Allerdings ist 
dieser Effekt für die unterschiedlichen Sprechergruppen unterschiedlich hoch, 
was nahelegt, dass es keinen generalisierten V2-Parameter gibt. 


1 Introduction 


German, Dutch and Norwegian are V2 languages. English may have had some V2 
properties in its earlier stages but is no longer regarded as a V2 language today. 
However, it still has some residual V2 properties, for instance in questions (Rizzi 
1990) and quotation contexts (Roeper 1999, 2016), and is thus sometimes charac- 
terized as a residual V2 language (Rizzi 1990). This paper investigates to what extent 
an L1 that is a V2 language influences the acquisition of an L2 that is a non-V2/ 
residual V2 language. In other words, we address the question to what extent V2 can 
be unlearned in second language acquisition (SLA). It will be shown that V2 
characteristics in L2 English of L1 speakers of German, Dutch and Norwegian are 
real, testable and quantifiable and very persistent even in advanced L2 speakers. 
It will also be shown that the ‘V2 effect’ is stronger in L1 speakers of German than 
in L1 speakers of Dutch and almost non-existent in L1 speakers of Norwegian. 
These results challenge the idea of simple parameter resetting in SLA because 
unlearning V2, i.e. a simple switch from V2 in L1 to non-V2 in L2, would entail 
that L1 speakers of V2 languages show uniform characteristics in L2 English. We 
argue for a grammar competition/multiple grammar (GC/MG) analysis instead 
(Kroch 1989; Kroch/Taylor 1997, 2000; Roeper 1999; Yang 2003), which shares 
the assumptions of the full access/full transfer (FA/FT) hypothesis of Schwartz/ 
Sprouse (1996), but argues against parameter resetting in SLA. Section 2 provides 
an introduction to GC/MG theory. Section 3 discusses the role of GC/MG theory 
in SLA in more detail and section 4 provides experimental data for GC/MG in 
wh-+particle constructions in L2 English. Section 5 concludes the paper. 


2 GC/MG theory in diachronic linguistics 
and L1 acquisition 


Grammar competition theory was first introduced in the context of diachronic 
linguistics (e.g. Kroch 1989; Kroch/Taylor 2000; Pintzuk 1999). Here Old English, 
which is traditionally classified as a language in which the object comes before the 
verb (OV language) (e.g. van Kemenade 1987; Kiparsky 1995; Fischer et al. 2000), is 
shown to also feature VO orderings in embedded clauses. These alternative 
orderings are accounted for under the assumption that two rules exist side by 
side in the grammar of Old English: one that generates the ‘traditional’ OV order 
and one that generates the verb-object (VO) order. Thus, the two rules exist within 
a single grammar and provide radically different and potentially contradictory 
structures. This idea is further developed in Pintzuk (1999)¹ under the name 
Double Base Hypothesis (DBH). According to the DBH the two word orders 
attested in Old English can be ascribed to the presence vs. absence of verb 
movement from V to a higher functional projection and the availability of head- 
initial and head-final structures in both projections. The higher projection may be 
T/Infl (Tense or Inflection) (Pintzuk 1999) or some lower head, e.g. v (Roberts 1997; 
Fuß/Trips 2002). Either way, two radically different structures can be derived. 
Roeper (1999) argues that GC effects can also be found in language acquisition 
and that this can be regarded as another indication of the frequently stated parallelism 


1 Pintzuk presents her analysis in the generative framework of the theory of principles and 
parameters (see e.g. Chomsky 1981 and subsequent publications). 

between language acquisition and diachronic language development. In 
those contexts where a language exhibits seemingly contradicting properties that 
can only be derived by two conflicting rules, grammar competition is at work. 
Thus, in early L1 acquisition when children simultaneously produce the structures 
in (1a) and (1b), grammar competition is involved (ibid.): 


(1) a. Him want... 
b. He wants... 


Here, Roeper (1999) argues, the child switches between a minimal default gram- 
mar that derives the structure in (1a) with default Case-assignment on the pronoun 
and no agreement marker on the verb and the structure in (1b) where nominative 
Case is assigned structurally and where the verb shows agreement morphology. 
Eventually, the child will progress from the minimal default grammar to the 
grammar that generates the structure in (1b), but this, according to Roeper (1999), 
involves neither parameter resetting nor deletion of the minimal default gram- 
mar that generates (1a).² In this context, too, two seemingly contradicting rules 
thus exist side by side, of which one is productive and the other is non-produc- 
tive and possibly deactivated. In more recent publications this idea is extended 
to contexts of SLA (Roeper 2003, 2016; Amaral/Roeper 2014) and the term multi- 
ple grammars rather than grammar competition is used. For the example from 
Old English and for language acquisition illustrated in (1), GC/MG can be 
regarded as an unstable state in which two rules are in competition within one 
and the same language. The unstable state of Old English is resolved to a stable 
condition in Early Middle English, where VO is the predominant order. In lan- 
guage acquisition the grammar that derives the structure in (1b) is the stable 
state at the end of the acquisition path. This, however, need not be the case. 
Particularly in acquisition contexts the competition between two conflicting 
rules can remain a stable state at the end of the acquisition path, or in other 
words, the acquisition path can be marked by two seemingly optional and con- 
flicting rules. For further clarification let us look at another example from lan- 
guage acquisition in some more detail. 

According to Chomsky (2005), three factors determine language acquisition. 
The first is genetic endowment, i.e. the basic principles of Universal Grammar 
(UG). The second is experience, i.e. empirical data that the child is confronted with 


2 Note, incidentally, that simple deactivation, instead of full deletion, of the minimal default 
grammar in (1a) allows older children or adult speakers to tap into this grammar to understand 
and mimic baby-talk (Amaral/Roeper 2014). 
in the acquisition process. Based on these data the child determines a limited 
number of parametric choices provided by UG. The third and final factor encom- 
passes several cognitive principles that are language (and potentially even 
organism) independent, i.e. general learning principles that are not language 
specific. From the minimalist perspective the main research question in recent 
years has been how much can be attributed to the third factor and how much 
remains language specific (Chomsky 2005 and subsequent publications). Although 
a definitive answer has not yet emerged, it is clear that language acquisition can- 
not do without genetic predisposition, i.e. some version of UG, and primary lin- 
guistic data, i.e. language input (e.g. Chomsky 2001, 2005 and subsequent publi- 
cations). This paper adopts these assumptions and argues for the existence of UG 
with some narrowly confined parametric options that are determined by language 
input. More specifically we argue that MG/GC theory exploits these assumptions in 
the sense that language acquisition is constrained only by UG, i.e. all languages that 
conform to UG are in principle acquirable, while, on the other hand, language learn- 
ers will never postulate any rules that are not in line with UG. The learner of any 
language X will thus acquire the rules of this language which are a subset of the 
possible rules of UG. These rules, once acquired, cannot be modified, replaced 
or deleted during the acquisition process. The only option for the language learner 
is to add new rules to the grammar of their language if required by linguistic evi- 
dence. In this scenario, seemingly contradicting rules can very easily arise. In 
these cases, the learner needs to determine which rules are productive in their 
language and which are lexically or contextually (or otherwise) constrained. Unu- 
sual though this might seem at first sight, it has immediate empirical and theoret- 
ical appeal and is strictly in line with the assumption that there is no (real) option- 
ality in language (Chomsky 1995). Hence, real optionality in language never arises 
and is always attributable to independent and potentially contradicting rules of 
grammar. In line with standard minimalist assumptions, we will further assume 
that these rules of grammar are maximally simple. How maximal simplicity is to 
be defined or measured exactly is still open to debate in minimalist theory and we 
will not provide an answer here (but see Chomsky 1957 for an early debate). It is 
equally unclear what a rule of grammar is and we will also leave this open for fur- 
ther research. However, we will follow standard assumptions in minimalist theory 
that rules operate over features. Thus, the exact determination of features that 
lead to syntactic rules will henceforth be left open, while it is assumed that fea- 
tures are the only necessary and sufficient factor for rule formation and that no 
other devices such as indices, diacritics or features of features are required (for 
discussion, see inter alia Adger 2010; Boeckx 2010; Reuland 2011; Bauke 2014; 
Amaral/Roeper 2014). Now let us discuss how this theory accounts for the postu- 
lation of seemingly contradictory rules as a stable state in language acquisition. 
Any child acquiring English will be faced with the question whether English 
requires the overt realization of subjects or not. Both options are available in UG 
and linguistic data will eventually lead the child to the postulation of a rule that 
requires the overt realization of subjects.³ For lack of a better representation we 
will pre-theoretically represent this rule as follows (cf. also Amaral/Roeper 2014 
for a similar representation): 


(2) [Subj/: +phon] 


However, the constructions in (3) (adapted from Amaral/Roeper 2014), which are 
all frequently attested in English, are all incompatible with the rule in (2): 


(3) a. Where’s Khatia? just left. 
b. cold today, isn’t it. 
c. sounds all right to me. 


Thus, all speakers of English must also have a rule in their grammar that gener- 
ates the structures in (3) and this rule is likely to contrast with the rule in (2). In 
principle, there are two options for integrating the rule for the data in (3) into the 
grammar. The first is to augment, i.e. change, the rule in (2) in the following way:⁴ 


(4)  [Subj/: +phon; iff Subj = Topic: -phon] 


This, however, would mean that the rule in (2) is manipulated and changed in the 
acquisition process. This is an option that MG/GC does not allow and it is theoreti- 
cally unclear how editing an existing rule can be accounted for in language acquisi- 
tion. The second option is to add a new rule to the grammar of English. This is in line 
with requirements from MG/GC theory and we thus suggest the informal rule in (5): 


(5) [Subj = Topic: -phon] 


This rule conflicts with, but does not change, the rule in (2) in any way. As a 
result, the grammar of English now holds both rules, i.e. (2) and (5), and this is 


3 We will not go into the exact nature of the acquisition path here (see for instance Hyams 2014 
for discussion in FLA contexts and Rothman/Slabakova 2017 for discussion in SLA contexts). 

4 Again, we use this rather informal representation for lack of a more concise alternative ex- 
pressed simply by features and we also remain silent on the question whether an analysis of (3) 
in terms of topic drop is ultimately correct. 
what marks a stable state in the grammar of English that allows for the generation 
of sentences with and without overt subjects. When comparing the rules in (2) 
and (5), we can clearly see that the apparent optionality of sentences without 
subjects in English, as in (3), boils down to specific contexts. 
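
The add-only character of this account can be made concrete with a minimal sketch in Python. It is purely illustrative and not taken from the MG/GC literature: the names Rule, Grammar, add_rule and realize_subject are hypothetical, and letting the rule with the more specific context take precedence is a simplifying assumption made only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """An informal rule such as (2) or (5); frozen, so it cannot be edited."""
    label: str
    context: frozenset  # conditions on the subject, e.g. {"topic"}; empty = unrestricted
    phon: bool          # True: subject is pronounced; False: subject is silent


class Grammar:
    """A rule store that only grows: no rule is modified, replaced or deleted."""

    def __init__(self):
        self._rules = []

    def add_rule(self, rule):
        self._rules.append(rule)

    @property
    def rules(self):
        return tuple(self._rules)  # read-only view of the stored rules

    def realize_subject(self, properties):
        """Return True if a subject with these properties is pronounced.

        Assumption of this sketch: among the applicable rules, the one
        with the most specific context wins.
        """
        applicable = [r for r in self._rules if r.context <= properties]
        if not applicable:
            raise ValueError("no rule applies")
        return max(applicable, key=lambda r: len(r.context)).phon


english = Grammar()
english.add_rule(Rule("(2)", frozenset(), True))            # overt subjects
english.add_rule(Rule("(5)", frozenset({"topic"}), False))  # silent topic subjects
```

Both rules remain in the grammar side by side: a subject that is a topic falls under (5) and stays silent, while any other subject falls under (2) and is pronounced.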


3 GC/MG theory in SLA 


Provided GC/MG theory is real and attested in diachronic contexts and in contexts 
of first language acquisition, it is relevant to ask whether it also exists in SLA 
contexts. The assumption that the final state of L1 is the initial state of L2 
(Schwartz/Sprouse 1994, 1996; White 2003 and many others) is in line with GC/MG 
theory.⁵ The rules of L1 will thus provide the starting point for L2 acquisition. 
Whenever these rules can also account for the L2 data, no further learning is 
required. Further learning will only occur in contexts where the L2 data are not 
compatible with the already acquired rules.⁶ Note that this does not necessarily mean that the 
use of L1 rules in L2 acquisition leads to target language rules. The interlanguage 
rules of the L2 learner may still be radically different from the native language rules 
of the L1 learner (cf. Schwartz/Sprouse 1994, 1996 for further details) but if they are 
not in conflict with surface representations, the L1 grammar will be applied to the 
L2 without further modification. So far, there is thus no difference between GC/MG 
theory and some version of the full-access/full-transfer model. The only difference 


5 We will therefore limit our explorations henceforth to the interaction of GC/MG with full- 
access/full-transfer. This is, of course, by no means the only theory of SLA, and it is impossible 
to do justice to the full spectrum of (generative) approaches to SLA here. For a comprehensive 
and accessible overview of these theories we refer the reader to Rothman/Slabakova (2017). 

6 An anonymous reviewer asks in this context how much input is required to trigger the postu- 
lation of an additional rule. This is a highly relevant but also very intricate question that we 
cannot fully address in this paper. It is well-known that L1 acquisition is marked by a certain 
amount of conservatism (e.g. Roeper 2016), which means that the child starts with conservative 
and narrow hypotheses and is somewhat reluctant to generalize to a general V2 grammar e.g. only 
on the basis of auxiliary raising in English. Only the full spectrum of constructions, e.g. main 
verb raising, auxiliary raising, object fronting, PP-fronting, etc. in German leads to a full-fledged 
V2 grammar. The situation may, however, be different for L2 acquisition - possibly depending 
on the nature of the L1 - where full access/full-transfer may lead to a situation in which the L2 
learner tests a rule of maximum generality and moves to more specific additional rules only on 
the basis of evidence that is incompatible with the maximal rule. In any case, we would argue 
that to the extent that frequency plays a role here it is type- rather than token-frequency that 
drives the learning process (see also Yang 2008 for more sophisticated arguments in the same 
direction). 
between the two approaches is that in those cases where further learning is required 
GC/MG theory assumes the addition to the grammar of new and simple rules that are 
confined only by UG, whereas under full-access/full-transfer restructuring and/or 
parameter resetting is assumed. Thus, in GC/MG theory the strict absence of any 
kind of rule restructuring that is postulated for L1 acquisition is also maintained 
in L2 acquisition contexts. Let us consider an example: we have seen above that 
an L1 speaker of English will have the rules in (2) and (5) in their grammar. Now 
let us assume that this speaker acquires Spanish as an L2. Neither of the rules in 
(2) and (5) can account for the Spanish data. Thus, further learning is required 
and again two options are available. Under the first option either of the two L1 
rules above could be restructured in such a way that it also allows for the L2 data. 
This, however, raises the question how the learner decides which of the two rules 
is restructured and whether this impacts the L1 in any way. In GC/MG theory, by 
contrast, these questions do not arise because here all that needs to be added to 
the grammar is a new (simple) rule: 


(6) [Subj/: phon] 


This rule readily accounts for the Spanish data. It conflicts with the rule in (2) 
just as much as the rule in (5) does and we already argued with regard to (5) that 
such a conflict is unproblematic and indeed desirable in L1 contexts where 
seemingly optional data need to be accounted for. Adding new rules is therefore 
unlikely to cause problems in SLA. Yet, what is needed for SLA is information on 
which rule belongs to which language (Amaral/Roeper 2014). This information is 
probably encoded in the feature geometry of the relevant rules. Hence, the gram- 
mar of an L1 English/L2 Spanish speaker will eventually accommodate the rules 
in (2) and (5) for English and the rule in (6) for Spanish, repeated below for 
convenience: 


(2) [Subj/: +phon] 
(5) [Subj = Topic: -phon] 
(6) [Subj/: phon] 


There is an obvious conflict and contradiction between the rules in (2) and (6). Yet 
neither is restructured or modified in any way. On the contrary, both rules are part 
of the grammar of an L1 English/L2 Spanish speaker and both are used produc- 
tively by the speaker. 

With this much in place, let us now turn to another case of GC/MG and the 
impact of a V2 rule for L1 speakers of German, Dutch and Norwegian acquiring L2 
English. 
4 V2 in German, Dutch and Norwegian 
speakers of L2 English 


4.1 Word order in German, Dutch and Norwegian 
vs. English main clauses 


V2 word order is typical of Germanic languages and German, Dutch and Norwegian 
display this word order pattern very robustly in main clauses.⁷ The examples in 
(7)-(9) illustrate for each language respectively that various kinds of constituents 
can occur in sentence initial position (7a/8a/9a-7d/8d/9d) as long as the verb is in 
second position (7e/8e/9e-7g/8g/9g): 


(7) a. Gestern tanzte die Staatsanwältin mit dem Professor. 
       Yesterday danced the state attorney with the professor. 
       ‘Yesterday the state attorney danced with the professor’. 
    b. Die Staatsanwältin tanzte gestern mit dem Professor. 
    c. Mit dem Professor tanzte die Staatsanwältin gestern. 
    d. ... 
    e. *Gestern die Staatsanwältin tanzte mit dem Professor. 
    f. *Die Staatsanwältin mit dem Professor tanzte gestern. 


(8) a. Gisteren danste de officier van justitie met de professor. 
       Yesterday danced the state attorney with the professor. 
       ‘Yesterday the state attorney danced with the professor’. 
    b. De officier van justitie danste gisteren met de professor. 
    c. Samen met de professor danste gisteren de officier van justitie. 
    d. ... 
    e. *Gisteren de officier van justitie danste met de professor. 
    f. *De officier van justitie met de professor danste gisteren. 

(9) a. I går danset statsadvokaten med professoren. 
       Yesterday danced the state attorney with the professor. 
       ‘Yesterday the state attorney danced with the professor’. 
    b. Statsadvokaten danset i går med professoren. 


7 This paper has nothing to say on embedded V2 in German nor on embedded non-V2 orders in 
Norwegian dialects. 
    c. Med professoren danset i går statsadvokaten. 
    d. ... 
    e. *I går statsadvokaten danset med professoren. 
    f. *I går med professoren danset statsadvokaten. 
    g. 


According to standard analyses (cf. Grewendorf/Hamm/Sternefeld 1987) the V2 
structure is generated by movement of the verb from V-final to C-head position. 
Here we will remain agnostic as to whether the verb passes through an intermediate 
T-head position or not (cf. Haider 2010) as this is orthogonal to the analysis. Nor 
will we take a stand on whether V° and possibly T° are head-initial or head-final 
(cf. Kayne 1994), i.e. whether head-directionality can be reversed or whether all 
structures are asymmetrical and universally head-initial. All that matters is that 
the verb in main clauses moves to C° and thereby generates the surface V2 order, 
with just one specifier position in CP to be filled by a maximal projection. Thus, 
we will assume the standard analysis by Grewendorf/Hamm/Sternefeld (1987) in 
the following slightly revised and updated version: 


(10) [CP [C' C [TP [T' [VP SUBJ [V' OBJ V]] T]]]] 


We argue that (10) is representative of German, Dutch and Norwegian main 
clauses, where the verb moves, as indicated, from V to T to C. This is called Finitum- 
voranstellung, i.e. fronting of the finite verb (Grewendorf/Hamm/Sternefeld 1987). 
This operation on its own generates the verb-first (V1) structure that characterizes 
polar interrogatives in all three languages: 


(11) Tanzte die Staatsanwältin gestern mit dem Professor? 
(12) Danste de officier van justitie gisteren samen met de professor? 


(13) Danset statsadvokaten i går med professoren? 
In a second step towards generating V2 order, another constituent is topicalized, 
i.e. moved to the specifier position of CP (Spec, CP), and the examples in (7)-(9) 
illustrate that the topicalized constituent can either be a subject, an object or an 
adverbial. In fact, with a few exceptions (i.e. reflexives, anaphoric accusative 
pronouns and some particles; Grewendorf/Hamm/Sternefeld 1987), almost any 
constituent can be topicalized in the three languages. Thus, it can be concluded 
that the rule that generates V2 in German, Dutch and Norwegian main clauses is 
a very robust characteristic of the L1 grammars of speakers of these languages. 
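
The two-step derivation just described, fronting of the finite verb followed by topicalization of one constituent, can be sketched procedurally. The Python below is a toy illustration over flat lists of constituents rather than syntactic trees; the function names front_finite_verb and topicalize are hypothetical, and the verb-final base order is a simplification assumed only for this sketch.

```python
def front_finite_verb(base):
    """Step 1: move the finite verb from clause-final position to the front (C).

    `base` is a list of constituents in verb-final order, with the finite
    verb as the last item. The result is a verb-first (V1) order.
    """
    *rest, verb = base
    return [verb] + rest


def topicalize(v1, constituent):
    """Step 2: move one constituent in front of the fronted verb (Spec, CP).

    The result is a verb-second (V2) order.
    """
    if constituent not in v1[1:]:
        raise ValueError("constituent not found after the verb")
    verb = v1[0]
    rest = [c for c in v1[1:] if c != constituent]
    return [constituent, verb] + rest


# Simplified verb-final base order (an assumption of this sketch).
base = ["die Staatsanwältin", "gestern", "mit dem Professor", "tanzte"]
v1 = front_finite_verb(base)     # word order of the polar question in (11)
v2 = topicalize(v1, "gestern")   # word order of the declarative in (7a)
```

Whichever constituent is topicalized, the verb stays in second position, which is the pattern illustrated in (7a-c).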

English, on the other hand, is not a V2 language, as illustrated by the example 
in (14): 


(14) Yesterday the state attorney danced with the professor. 


Here the verb is in third position preceded by two other constituents, viz. the 
adverbial yesterday and the subject DP the state attorney. This is because the verb 
does not move to C° in English, hence yesterday may occupy the Spec, CP position 
and the subject DP may occupy the specifier position of TP below and the verb, 
which has (arguably) moved as high as T°, is in third position. With this much in 
place let us now turn to wh-constructions in L2 English and let us define the 
learning paths for all three L1 speaker groups separately. 


4.2 Wh-constructions 


The learning task for any L1 speaker of German, Dutch and Norwegian acquiring 
L2 English is to ‘unlearn’ the V2 properties of the respective L1. However, the 
learning task is complicated by the fact that English shows some residual V2 
properties (Rizzi 1990). In wh-constructions in English auxiliaries move from T to 
C and thus mirror the movement of auxiliaries and main verbs in main clauses in 
German, Dutch and Norwegian, where the verb also ends up in C° and arguably 
moves through T° along the way. Of course, this movement operation does not 
occur with main verbs any more (it was still commonplace in Early Modern 
English but has been lost in Present-Day English) and is limited to auxiliaries but still, it 
is a persistent characteristic of wh-constructions, as is illustrated in the following 
examples for subject and object questions: 


(15) a. Who will kiss Khatia? 
     b. Who will [TP <who> <will> kiss Khatia]? 

(16) a. Who will Ted kiss? 
     b. Who will [TP Ted <will> kiss <who>]? 
It is this dichotomy between generalized verb movement in German, Dutch and 
Norwegian on the one hand and residual verb movement in English on the other 
that is explored further in this paper. According to GC/MG theory, a competition 
between the generalized verb movement operation in the L1 and the residual 
verb movement operation in L2 is expected. For German, Dutch and Norwegian 
learners of L2 English partial ‘unlearning’ of the V2 movement rule is required. 
According to MG/GC theory, where full or partial restructuring is not an option, 
this means that an additional rule is added to the grammar that accounts for 
residual V2, i.e. movement of auxiliaries only, from T° to C° in L2 English. In the 
remainder of our study, we probe into the interpretation of wh-+particle 
constructions by the target learner groups. 


4.2.1 L1 German - L2 English 
(17) illustrates the relevant wh-+particle constructions for German and English. 


(17) a. Who picked Ann up? 
b. Wer holte Anna ab? 


So, in the German example in (17b) the verb has moved to C° and the Spec, CP posi- 
tion is filled by the wh-constituent. In the English example in (17a) on the other 
hand, the verb has not moved to C°, as this movement operation is restricted to 
auxiliaries. Nevertheless, Roeper (2016), referring to work by Rankin,⁸ reports that L1 
German speakers tend to assign non-target interpretations to the question in (17a), 
which is then either interpreted as ambiguous between a subject- and an object 
question or as an object question only. According to Roeper, L1 German learners of 
L2 English tend to apply the L1 grammar rule in their L2 interlanguage because 
there is no immediate evidence of any conflict. Any potential evidence could only 
come from an alternative construction that is available in English but illicit in 
German, i.e. the one in which the particle is pied-piped to sentence medial position: 


(18) a. Who picked up Ann? 
b. *Wer holte ab/abholte Anna? 


8 For instance, Rankin (2014) shows that L2 English speakers with L1 German interpret wh-sub- 
ject questions (without particles) as ambiguous or as object questions, particularly when the 
wh-word is [+ animate]. [- animate] constructions are much more consistently interpreted as 
subject questions only. We took these results into account in our study and tested [+ animate] 
constructions only. 
So, while the data in (18a) clearly signal for the L2 English speaker that the L1 
grammar cannot be applied in these contexts, the data in (17a) do not provide any 
such evidence and a competition from the L1 grammar is very likely, leading the 
L2 English speaker to optionally assign an object interpretation to the question in 
(17a). Of course, the target language grammar does not provide this option and 
requires do-insertion: 


(19) a. Who(m) did Ann pick up? 
b. Wen holte Anna ab? 


In German do-insertion is not required and the difference between the subject 
and the object question is marked by Case-assignment on the wh-constituent.⁹ 
So we argue that for L2 English speakers with L1 German there is a representa- 
tional conflict for the construction in (17a) that originates in GC/MG between a 
generalized V2 rule in the L1 and a residual V2 rule in the L2 and leads speakers 
to assign an ambiguous or even an object interpretation to the subject question. 
We further argue that this competition is even stronger in those contexts where 
the morphological form of the wh-constituent does not bias speakers towards a 
subject interpretation. The relevant data is illustrated in (20) below: 


(20) a. Which one picked Ann up? 
b. Which one picked up Ann? 


Here again, we expect that the construction in (20a) leads to grammar competition 
between generalized V2 in L1 and residual V2 in L2, while the construction in 
(20b), where the particle is pied-piped, again signals that the L2 grammar must be 
used because particle pied-piping is ungrammatical in the L1. 


4.2.2 L1 Dutch - L2 English 
At first sight German and Dutch seem to pattern alike and we would not expect 
any major differences in the L2 learning path for the two L1 speaker groups. Both 
languages display solid V2 properties and verb particle constructions have a very 


9 Arguably English also marks the distinction between subject- and object wh-pronouns morpho- 
logically in who vs. whom but this distinction is not realized systematically and it is absent in other 
wh-constituents, cf. e.g. what. 
similar distribution in both languages, as is illustrated in the following examples 
(cf. Neeleman/Weerman 1993 for examples from Dutch): 


(21) a. Jan belt het meisje op. 
        Jan calls the girl up. 
        ‘Jan phones the girl.’ 
     b. Jan ruft das Mädchen an. 
        Jan calls the girl up. 
        ‘Jan phones the girl.’ 

(22) a. dat Jan het meisje opbelt 
        that Jan the girl up-phones 
        ‘that Jan calls the girl’ 
     b. dass Jan das Mädchen anruft 
        that Jan the girl up-phones 
        ‘that Jan calls the girl’ 


In questions, on the other hand, we find one factor that distinguishes Dutch 
from German and eventually likens Dutch and English grammars in a relevant 
way. As we saw above, German, in contrast to English, makes a morphological 
distinction between subject and object wh-pronouns. Further relevant examples 
are provided in (23): 


(23) a. Wer liebt Anna? 
Who.NOM loves Anna 
‘Who loves Anna?’ 


b. Wen liebt Anna? 
Who(m).ACC loves Anna 
‘Whom does Anna love?’ 


Now let’s compare this to Dutch: 


(24) a. Wie ziet Anna? 
Who.NOM sees Anna 
‘Who sees Anna?’ 


b. Wie ziet Anna? 
Who(m).ACC sees Anna 
‘Whom does Anna see?’ 
As we can see from the examples in (24a/b) Dutch, unlike German, does not mark 
the distinction between subject and object (nominative and accusative) wh- 
pronouns morphologically. Dutch and English are thus almost indistinguishable 
in this respect, with English marking the distinction only optionally (cf. footnote 
9). Hence L1 Dutch L2 English speakers are much more used to syncretism on 
wh-constituents than L1 German L2 English speakers and we can expect L1 Dutch 
speakers to be able to resolve morphological ambiguity resulting from syncretic 
wh-constituents much more easily than L1 German speakers, possibly by making 
use of a universal subject bias that can be observed cross-linguistically (Culbertson/ 
Kirby 2016). 

This then leaves the question of particle placement in Dutch and again, we 
can observe a crucial difference between Dutch and German and an effect where 
particle placement and syncretic Case-marking on pronouns conspire. This is 
illustrated in the following examples in (25): 


(25) a. Wie houdt van Jan? 
        Who.NOM loves Jan 
        ‘Who loves Jan?’ 
     b. Van wie houdt Jan? 
        Whom.ACC loves Jan 
        ‘Whom does Jan love?’ 


So, in (25a/b) the morphological form of the wh-pronoun remains syncretic, just 
like in the examples (24a/b) above. In contrast to the examples in (24a/b), how- 
ever, there is a way in Dutch to disambiguate between a subject and an object 
reading for the wh-pronoun in wh-+particle constructions. Crucially, this disam- 
biguation goes along with moving the particle to sentence initial position. 
Pied-piping the particle no further than sentence medial position, i.e. to the 
position that is grammatical in English and ungrammatical in German (cf. 
examples in section 4.2.1 above), yields a subject interpretation for the 
wh-pronoun in Dutch.¹⁰ 


10 Notice incidentally, that stranding the particle in sentence final position is not an option for 
the sample sentences in (25) in Dutch: 

(25)’ *Wie houdt Jan van? 

Interestingly, it is grammatical in Frisian though (many thanks to Arjen Versloot for pointing this 
out): 

(25)” Wa hâldt Jan fan? 

We will leave a more detailed investigation of this interesting observation to future research. 
4.2.3 L1 Norwegian - L2 English 


Given that Norwegian is also a V2 language, as shown in section 4.1 above, we 
again expect to find a grammar competition effect in wh-+particle constructions 
for L1 Norwegian and L2 English speakers.¹¹ Again, however, we have to take 
some further aspects of Norwegian syntax into account. The examples in (26) 
(taken from Åfarli 1985) illustrate that particle pied-piping is licit in the same 
contexts in Norwegian as in English. This marks a clear difference from the particle 
placement options for German and Dutch respectively, which were illustrated in 
sections 4.2.1 and 4.2.2 above: 


(26) a. Jon sparka hunden ut. 
John kicked dogs out 
‘John kicked the dogs out.’ 


b. Jon sparka ut hunden. 
John kicked out the dogs 
‘John kicked the dogs out.’ 


Furthermore, Norwegian wh-constituents are ambiguous between a subject and 
an object interpretation, as is illustrated below. This makes these constructions 
reminiscent of the syncretic wh-constructions that we find in Dutch but it is 
clearly different from German, where morphological marking allows for a clear 
distinction between subject and object questions: 


(27) Hvem elsker Lasse? 
Who loves Lasse 
a. ‘Who loves Lasse?’ 
b. ‘Who does Lasse love?’ 


However, in Norwegian the ambiguity vanishes in wh-+particle constructions, in 
the sense that those constructions in which the particle is pied-piped can only be 
interpreted as subject-questions. Here, then, the Norwegian and the English 
grammar conspire towards a subject-only reading. The ambiguity remains when 
the particle is stranded, as is shown in (28): 


11 See also Westergaard (2003) for arguments that V2 is hard to unlearn in English. 


256 — Leah S. Bauke 


(28) Hvem kastet hundene ut? 
Who kicked dogs out 
a. ‘Who kicked the dogs out?’ 
b. ‘Whom did the dogs kick out?’ 


(29) Hvem kastet ut hundene? 
     Who kicked out the dogs 
     a. ‘Who kicked the dogs out?’ 
     b. *‘Whom did the dogs kick out?’ 


For those L2 English wh-+particle constructions in which the particle is pied- 
piped, we thus expect a strong tendency towards subject-interpretations. For those 
constructions where the particle is stranded, we expect a higher tendency towards 
ambiguous or object-interpretations of the wh-+particle constructions, unless 
this is again counter-balanced by a universal subject bias. Another potential factor 
is that the construction gets a very strong subject reading in several Norwegian 
dialects even with the particle stranded. Consider the following data from North- 
ern Norwegian: 


(30) Kvem kasta han Ole ut? 
Who kicked he Ole out 
‘Who kicked Ole out?’ 


The object reading is preferably generated in a cleft-construction, in order to 
disambiguate it from the subject reading: 


(31) Kvem va det som kasta han Ole ut? 
Who was that SOM kicked he Ole out 
‘Who was it that Ole did kick out?’ 


The same patterning can be found in the southwestern Rogaland dialect and in 
Bergensk (spoken in Bergen, where the data were collected). Such disambiguation 
by clefting, which can be found in several dialects, may well have a GC/MG 
effect on the interpretation of English wh-+particle constructions in which the 
particle is stranded and push L2 speakers towards a target-like interpretation. 

A final factor that needs to be taken into account comes from the English 
grammar itself. We argued above that the GC/MG effect should decrease for L1 
German speakers if the particle is pied-piped because this option is not available 
in their L1 and thus is a clear signal to use the English grammar in these contexts. 
For L1 Norwegian speakers the logic can be reversed in the sense that the 
construction with particle stranding in (28) remains ambiguous, at least in Bokmål 
and in some dialects. This would again suggest a higher tendency towards ambig- 
uous or object-interpretations. Note, however, that in English the object question 
would require do-support and for L1 Norwegian speakers the lack of do-support, 
combined with the two other factors discussed above, might be enough to prevent 
an object reading. Note further that this signal would not be enough for the Ger- 
man speakers, because here the verb is always in V2 position and the particle is 
always stranded: 


(32) a. Wer trifft Hans an? 
        Who hits John at? 
        ‘Who meets John?’ 
     b. Wen trifft Hans an? 
        Who hits John at? 
        ‘Who does John meet?’ 

(33) a. Wer hat Hans angetroffen? 
        Who has John at.hit 
        ‘Who has met John?’ 
     b. Wen hat Hans angetroffen? 
        Who has John at.hit 
        ‘Who has John met?’ 


So even though Norwegian is a V2 language like German and Dutch and unlike 
English, we would expect a lower grammar competition effect in L1 Norwegian 
L2 English speakers than for L1 German/Dutch L2 English speakers. This, we 
would argue, is not due to the fact that there is no GC/MG between the two 
languages. Rather, it stems from the fact that applying Norwegian grammar in 
English (in line with full-access/full-transfer) does not cause any interference 
effect precisely because the two grammars are not in conflict here. 


4.3 Testing GC/MG in German, Dutch and Norwegian 
speakers of L2 English 


Based on the observations in the preceding sections, we can now formulate the 
following hypothesis: 
(34) Hypothesis: the role of V2 in MG/GC for L1 V2 speakers of L2 English: 
a. L1 V2 speakers with L2 English are much more likely to assign an 
ambiguous interpretation to unambiguous subject-questions than L1 
English native speakers (and speakers of an L1 that is not V2). 
b. The effect of an ambiguous interpretation is strongest in contexts 
where neither Case-marking on the wh-word nor particle pied-piping 
is present. 


Furthermore, we predict the following language-specific effects for German, 
Dutch and Norwegian speakers: 


(35) a. For L1 German speakers we predict the highest effect for those con- 
structions where the particle is in sentence final position and the wh- 
constituent is ambiguous between a subject and an object reading. 

b. For L1 Dutch speakers we predict the effect of the absence of Case- 
marking on the wh-constituent to be lower than for L1 German 
speakers. 

c. For L1 Norwegian speakers we predict target-like interpretations for 
those constructions where the particle is pied-piped and a possibly 
very low effect for those constructions where the particle is stranded. 


Both (34) and (35) are addressed in a series of studies that are presented in the 
following sections. 


5 Study 1: L1 German, Dutch or Norwegian 
L2 English 


5.1 Participants 


We tested four groups, three experimental groups for the three different L1 V2 
languages and one control group of L1 English speakers.¹² The first group 
encompassed 185 students enrolled in the BA study program of the Department 
for English and American Studies at the University of Wuppertal (Germany). 166 
of these participants identified German as (one of) their native language(s). We did 


12 Cf. also Bauke (2019) for discussion of the L1 German data. 
not distinguish between monolingual and bilingual L1 speakers, but those participants 
who did not identify German as their mother tongue were excluded from the 
statistical analysis.¹³ Students were aged between 18 and 28 years (M = 21.37, 
SD = 2.104) and they started learning English between 6 and 10 years of age 
(M = 8.53; SD = 1.071). Their proficiency level ranges between C1 and C2 in the CEFR 
(measured by an in-house ‘placement test’ that all students entering the program 
have to take). 

The second group consists of 30 BA students enrolled in the program of English 
language and culture at the University of Amsterdam (Netherlands). These students 
are in the first year of the program and they all identified Dutch (and Dutch only) as 
their first language. Participants were aged between 19 and 27 years (M = 22.03; 
SD = 2.282) and they started learning English between 8 and 12 years of age (M = 9.90; 
SD = 1.918). Thus, the age range and the onset of acquisition range in the group of 
L1 Dutch speakers are relatively similar to the group of L1 German speakers. 

The third group contains 32 BA students enrolled in the program of English in 
the Department of Foreign Languages at the University of Bergen (Norway). These 
students are in the first year of said program. All participants identified Norwegian 
as their native language and several identified a Norwegian dialect as an addi- 
tional first language. Students were aged between 18 and 33 years (M = 20.94, 
SD = 3.079) and are thus comparable to the groups of L1 German and L1 Dutch 
speakers. However, at least some of them started learning English at a younger 
age than the speakers in the L1 German and L1 Dutch groups. Students were aged 
between 4 and 10 years when they started learning English (M = 6.84, SD = 1.725). 

The control group consists of 20 native speakers of English. 
These participants were recruited through private contacts; the age range in this 
group is between 19 and 73 years (M = 40.00, SD = 14.287). 


5.2 Materials and procedure 


Participants were asked to complete a pen and paper questionnaire that consisted 
of a free choice answer task. The questionnaire contained 32 questions that tested 
the interpretation of English wh-questions in wh-+particle constructions. Participants 
were unaware of the aim of the study and had no time restrictions for completion 
of the questionnaire. They were instructed to read a short scene, like the 
one illustrated below, and to answer a follow-up question: 


13 All in all 23 speakers in this group identified as bi- or multilingual and since the other two 
experimental groups consisted either of monolingual speakers or of monolingual and dialect 
speakers, we did not further distinguish between monolingual and bi- or multilingual speakers 
in this group either. 


(36) Susie, Ann and Maggie are carpooling. On Mondays Maggie picks Susie 
up. On Tuesdays Ann picks Maggie up and on Fridays Susie picks Ann up. 


Each scene is followed by a free choice question. This follow-up question tests the 
variables presented in chapter 4, indicated here in the right-hand column of (37). So 
the scene in (36) is followed by one of the questions in the left-hand column in (37): 

(37) a. Which one picked Ann up?    which one - V - XP - Part(icle) 
     b. Which one picked up Ann?    which one - V - Part - XP 
     c. Who picked Ann up?          who - V - XP - Part 
     d. Who picked up Ann?          who - V - Part - XP 


In (37a/b) the wh-constituent does not carry any morphological marking; in 
(37c/d), on the other hand, who (as opposed to whom) carries a mild bias towards 
subject morphology (cf. discussion in chapter 4 above). In (37a/c) the particle is 
stranded in sentence-final position; in (37b/d), on the other hand, the particle 
is pied-piped. 

Each of the four question patterns illustrated in (37) occurs eight times through- 
out the questionnaire in randomized order. Data collection was distributed over 
four different undergraduate courses at Bergische Universität Wuppertal, and all 
data were collected within three consecutive days. All courses received the same 32 
scenes and follow-up questions. However, the order of scenes and questions was 
varied and randomized for the four groups in order to control for pattern learning 
that might arise while participants complete the study. Since pattern learning did 
not turn out to be a determining factor, it is not taken into further consideration 
in the discussion of the results below nor in the questionnaires for the Dutch 
and Norwegian speakers. Here, all data were collected in a single session on a 
single day. All meta-data were collected in the respective L1s: German, Dutch or 
Norwegian. 
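
The questionnaire design described above (four question patterns, eight occurrences each, presented in a different random order per course) can be sketched as follows. This is a minimal illustration only: the function name, the item numbering and the seeds are our own assumptions and do not reproduce the actual materials.

```python
import random
from collections import Counter

# The four question patterns from (37); the string labels are shorthand.
PATTERNS = [
    "which one - V - XP - Part",
    "which one - V - Part - XP",
    "who - V - XP - Part",
    "who - V - Part - XP",
]
REPETITIONS = 8  # each pattern occurs eight times, giving 32 items


def build_questionnaire(seed):
    """Return the 32 (pattern, item_number) pairs in a seeded random order."""
    items = [(p, i) for p in PATTERNS for i in range(1, REPETITIONS + 1)]
    rng = random.Random(seed)  # hypothetical per-course seed
    rng.shuffle(items)
    return items


# One ordering per course; the seeds 1-4 are arbitrary placeholders.
orders = {course: build_questionnaire(seed=course) for course in range(1, 5)}
counts = Counter(pattern for pattern, _ in orders[1])
print(len(orders[1]), dict(counts))
```

Every course receives the same 32 items; only the order differs, which is what allows pattern learning to be checked across courses.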

Answers were coded as correct/target-like if participants provided answers 
that showed a subject-only interpretation for the wh-constituents. So for the 
questions illustrated in (37) the only correct/target-like answer would be Susie. 
All answers that showed an ambiguous interpretation between a subject and 
object interpretation (e.g. when a participant answered both Susie and Maggie) 
were coded as non-target like generalized V2 interpretations. Similarly, all answers 
that showed an object-only interpretation for the wh-constituent were also coded 
as non-target like generalized V2 interpretations. For the examples above this 
would be all answers that contained Maggie. All other answers, i.e. blank answers, 
obvious statements of confusion such as I don’t know or unexpected choices such 
as Ann, which occurred occasionally, were coded as missing values. 
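
The coding scheme just described can be illustrated with a small sketch. The function below is a hypothetical helper, not the procedure actually used for coding; it assumes that an answer can be classified by checking which referent names it mentions, and the category labels are our own shorthand.

```python
def code_answer(answer, subject, obj):
    """Code a free-choice answer following the scheme described above.

    answer  -- the participant's written answer (a string)
    subject -- the referent required by the subject reading
    obj     -- the referent required by the object reading
    """
    text = answer.lower()
    # Simple substring matching; real coding would also have to handle
    # spelling variants and paraphrases.
    has_subject = subject.lower() in text
    has_object = obj.lower() in text
    if has_object:
        # Object-only and ambiguous (subject + object) answers are both
        # coded as non-target-like generalized V2 interpretations.
        return "generalized V2"
    if has_subject:
        return "subject reading"
    # Blank answers, "I don't know", or unexpected choices such as "Ann".
    return "missing"


# Scene (36) with the question "Who picked Ann up?":
# subject reading = Susie, object reading = Maggie.
for reply in ["Susie", "Susie and Maggie", "Maggie", "", "Ann"]:
    print(repr(reply), "->", code_answer(reply, "Susie", "Maggie"))
```

Answers coded as missing are excluded before percentages are computed, which is why the two columns in each results table sum to (roughly) 100%.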


5.3 Results 


The results for the experimental groups of German, Dutch and Norwegian L2 Eng- 
lish speakers and for the control group of L1 English speakers are summarized in 
tables 1-4 below: 


Table 1: generalized V2 readings - L1 German speakers 

German                       subject reading   generalized V2 
which one - V - XP - Part    72.97%            27.03% 
which one - V - Part - XP    90.67%            9.33% 
who - V - XP - Part          83.82%            16.18% 
who - V - Part - XP          91.57%            8.43% 


Table 2: generalized V2 readings - L1 Dutch speakers 

Dutch                        subject reading   generalized V2 
which one - V - XP - Part    86.67%            13.33% 
which one - V - Part - XP    98.4%             1.6% 
who - V - XP - Part          91.67%            8.33% 
who - V - Part - XP          92.5%             7.5% 


Table 3: generalized V2 readings - L1 Norwegian speakers 

Norwegian                    subject reading   generalized V2 
which one - V - XP - Part    94.514%           5.468% 
which one - V - Part - XP    98.829%           1.171% 
who - V - XP - Part          98.438%           1.562% 
who - V - Part - XP          97.657%           2.343% 


Table 4: generalized V2 readings - L1 English speakers 

English                      subject reading   generalized V2 
which one - V - XP - Part    100%              0% 
which one - V - Part - XP    100%              0% 
who - V - XP - Part          100%              0% 
who - V - Part - XP          100%              0% 


The results show that even very advanced L1 German speakers of L2 English 
tend to interpret subject wh-questions in wh-+particle constructions as 
generalized V2 constructions, i.e. either as object questions or as ambiguous 
between subject and object interpretations. The effect is much stronger when the 
particle is stranded and strongest in the absence of (potential) Case-marking on 
the wh-constituent, where it reaches a level of almost 30%. If the particle is pied- 
piped in the wh-+particle constructions, error rates drop to levels below 10% for 
both wh-constituents and there is hardly any difference in the interpretation of 
questions introduced by who and which one respectively. 

For the L1 Dutch speakers, too, generalized V2 plays a significant role in the 
interpretation of wh-+particle constructions. The results further show that L1 
Dutch L2 English speakers with a very high proficiency level in English show 
tendencies that are similar to those observed for L1 German L2 English speakers. 
There seems to be GC/MG between generalized V2 in L1 Dutch and residual V2 in 
L2 English, particularly when the particle is stranded. The effect is again strongest 
(peaking just below 15%) in the absence of (potential) Case-marking on the wh- 
word. Pied-piping the particle also leads to a decrease in error rates, and in the 
wh-construction in which the wh-constituent is not Case-marked the generalized 
V2 effect is virtually absent (1.6%). L1 Dutch L2 English speakers show native-like 
proficiency in their interpretation of this construction. Though the tendencies 
in conditions 1-3 are the same for both the German and the Dutch L2 English 
speaker groups, the overall error rates for L1 Dutch speakers are significantly lower 
than those for the L1 German speakers. In other words, the GC/MG effect for L1 
Dutch speakers is lower overall than for the L1 German speakers. When comparing 
the results for the fourth condition from the L1 Dutch speakers to the results in the 
same condition for the L1 German speakers, we notice another effect. In condi- 
tions two and four, i.e. in those constructions where the particle is pied-piped, L1 
German speakers produce the lowest overall error rates and the error rates for both 
conditions are almost on the same level. We already pointed out that L1 Dutch 
speakers show the same tendencies as the L1 German speakers in their error 
rates in conditions 1-3. The overall error rate for condition four, however, is 
lower than the error rate in the condition in which the particle is not pied-piped 
and the wh-constituent is not Case-marked (i.e. condition 1), but it is at almost 
the same level as in condition 3, i.e. the wh-construction in which the 
wh-constituent is Case-marked (but the particle is not pied-piped). Thus, while error rates drop to 
native speaker level in condition 2 with pied-piping, in condition 4 pied-piping 
does not have the same effect. On the contrary, if the wh-constituent is who as 
opposed to which one, pied-piping and preposition stranding both pattern 
alike. 
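
The contrast between the two groups can be made concrete by computing, for each wh-constituent, the drop in generalized V2 readings from the stranded to the pied-piped condition. The sketch below uses only the percentages reported in Tables 1 and 2; the dictionary layout and the function name are illustrative assumptions.

```python
# Generalized V2 rates (%) as reported in Tables 1 and 2.
RATES = {
    "German": {
        ("which one", "stranded"): 27.03,
        ("which one", "pied-piped"): 9.33,
        ("who", "stranded"): 16.18,
        ("who", "pied-piped"): 8.43,
    },
    "Dutch": {
        ("which one", "stranded"): 13.33,
        ("which one", "pied-piped"): 1.6,
        ("who", "stranded"): 8.33,
        ("who", "pied-piped"): 7.5,
    },
}


def placement_effect(group, wh):
    """Percentage-point drop in generalized V2 readings under pied-piping."""
    rates = RATES[group]
    return round(rates[(wh, "stranded")] - rates[(wh, "pied-piped")], 2)


for group in RATES:
    for wh in ("which one", "who"):
        print(group, wh, placement_effect(group, wh))
```

On these figures, particle placement makes a difference of 17.7 and 7.75 percentage points for the L1 German group (which one and who, respectively), but of 11.73 and only 0.83 percentage points for the L1 Dutch group.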

The results for Norwegian L2 English speakers show target-like interpretations 
regardless of whether the particle is stranded or pied-piped and regardless of 
whether wh-constituents are overtly Case-marked or not. There seems to be a very 
small and only marginally significant effect for the first condition, i.e. the con- 
struction in which the wh-pronoun is not Case-marked and the particle is stranded. 
Closer inspection, however, reveals that the error rate of just over 5%, which is 
on the brink of being statistically significant, comes from a single speaker who 
interpreted all 8 occurrences of this construction as object questions. The other 
31 L1 Norwegian speakers did not make any errors in this condition. So, we will 
not take the question whether the error rate is significant or not into further 
account here. 

The results in table 4 clearly illustrate that native speakers of English per- 
formed as expected. They did provide some unexpected answers or occasion- 
ally left some answers blank. These were coded as missing values just like in the 
other groups. What is striking is that none of the L1 English speakers provided 
any answers that indicated an ambiguous or object-only interpretation for any 
of the question patterns illustrated in (37). We interpret this high-level 
performance of the control group as an indication that the test is not intrinsically 
flawed. 

A 2x4 ANOVA that checks for statistical significance of the results between 
the various groups (L1 vs L2 speakers) and across the four test conditions (which 
one - V- XP - Part = type 1; which one - V- Part — XP = type 2; who — V - XP - 
Part = type 3, who - V - Part - XP = type 4) shows a significant effect of L1/L2 and 
of type 1 and type 3 (p < .001) and a significant effect for type 2 and type 4 (p < .01) 
for L1 German speakers. For L1 Dutch speakers results show a significant effect 
of L1/L2 and of type 1, 3 and type 4 (p < .01) and no significant effect for type 2 
(p > .05). For L1 Norwegian speakers results show a marginally significant effect 
of L1/L2 for type 1 (p < .05), which, as indicated above, we will ignore, and no 
significant effect for types 2, 3 and 4 (p > .05). The results of the between group 
comparison are summarized in Figure 1: 

Fig. 1: subject interpretation in % 


5.4 Interpretation 


We can now return to the hypothesis and predictions we formulated in chapter 
4.3, repeated here for convenience: 


(34) Hypothesis: role of V2 in MG/GC of L1 V2 and L2 English speakers: 

a. L1 V2 speakers with L2 English are much more likely to assign an 
ambiguous interpretation to unambiguous subject-questions than L1 
English native speakers (and speakers of an L1 that is not V2). 

b. The effect of an ambiguous interpretation is strongest in contexts 
where neither Case-marking on the wh-word nor particle pied-piping 
is present. 


(35) Further predictions: 

a. For L1 German speakers we predict the highest effect for those 
constructions where the particle is in sentence final position and the 
wh-constituent is ambiguous between a subject and an object reading. 

b. For L1 Dutch speakers we predict the effect of the absence of 
Case-marking on the wh-constituent to be lower than for L1 German speakers. 

c. For L1 Norwegian speakers we predict target-like interpretations for 
those constructions where the particle is pied-piped and a possibly 
very low effect for those constructions where the particle is stranded. 

As the results from this study show, GC/MG in the interpretation of wh-+particle 
constructions is an issue for German L2 English speakers that leads to a 
non-target generalized V2 interpretation for these constructions. The effect is 
strongest for those constructions where the particle is stranded and it peaks 
for the construction in which the particle is stranded and the wh-constituent is 
syncretic between nominative and accusative Case-marking. Note that if the L2 
grammar were simply to be restructured, we would expect the error rates of 
highly proficient L2 speakers to be much lower, ranging somewhere at the 
level of type 2 and type 4 constructions. At these levels, grammar competi- 
tion is much less of an issue because particle pied-piping is ungrammatical in 
German. 

For Dutch speakers we find a GC/MG effect as well. However, since Dutch, 
like English, allows for particle pied-piping to sentence medial position in subject 
wh-+particle constructions and also shows syncretism on wh-constituents it is 
impossible to determine which of the two factors is more relevant. What we can 
say, however, is that particle placement has a highly significant effect in condition 
1 and 2. Here error rates are brought down from the highest level overall to native 
speaker like competence depending on particle placement. While particle place- 
ment has virtually no effect in L1 Dutch speakers in conditions 3 and 4, error 
rates in these conditions are significantly lower than in condition 1 and signifi- 
cantly higher than in condition 2. This conspires with the results from German, 
i.e. with Case-marking taking a back-seat to particle placement for speakers of 
L1 V2 L2 English. Hence, when comparing L1 German and L1 Dutch speakers to L1 
English speakers, there seems to be a GC/MG effect between generalized and 
residual V2 for the two L1 German and Dutch speaker groups. Thus, there seems 
to be a general V2 effect. However, the strength of the effect is different for the 
two L1 speaker groups, with L1 German speakers showing a much stronger effect 
than L1 Dutch speakers. This, we argue, indicates that V2 may not be as broad a 
parameter as frequently assumed. 

This is further corroborated by the fact that L1 Norwegian L2 English speakers, 
unlike L1 German/Dutch L2 English speakers, show target-like results in their inter- 
pretation of wh-+particle constructions regardless of whether the wh-constituent 
is overtly Case-marked or not and regardless of whether the particle is pied-piped 
or stranded. If there were a broad V2 parameter that applies cross-linguistically, we 
would not expect such a strong counter-balancing effect from particle placement 
and morphological marking in Norwegian. Additionally, following Roeper (2016), 
we take this as an indication that if the GC/MG effect is observed only in specific 
groups of advanced V2 speakers then it is less likely to be accounted for by a 
production or processing error. Rather, it indicates a representational conflict in 
terms of competing grammars. 


6 Further predictions and conclusion 


Our observations on the role of GC/MG in German, Dutch and Norwegian speakers 
allow us to make the following predictions on acquisition path in L2 English for L1 
speakers of other V2 languages: 

For Icelandic, we expect L1 Icelandic speakers of L2 English to pattern with 
L1 Norwegian speakers because verb-particle constructions in Icelandic show 
the same distribution as in Norwegian, as is shown in the examples in (38) taken 
from Svenonius (1996): 


(38) a. Þjónninn þurrkaði rykið af 
waiter wiped dust off 
‘The waiter wiped off the dust.’ 
b. Þjónninn þurrkaði af rykið 
waiter wiped off dust 
‘The waiter wiped off the dust.’ 


As we can see in the examples in (38) particles in Icelandic can be either pied-piped 
or stranded and hence L1 Icelandic speakers should pattern with L1 Norwegian 
speakers, i.e. we do not expect any interference effect when they apply the Icelan- 
dic grammar to English wh-+particle constructions. If anything, we expect L1 
Icelandic speakers to be even more robust than L1 Norwegian speakers in their 
interpretations of wh+particle constructions in English, at least under the analysis 
that in Icelandic, as in Yiddish, V2 is movement to T rather than C (cf. Diesing 
1990). In this respect, too, Icelandic patterns much more closely with English 
than any of the other V2 languages investigated where V2 is movement to C. 

Swedish does not allow particle stranding, as is illustrated by the following 
data from Svenonius (1996): 


(39) a. *Kyparen torkade dammet av. 
waiter wiped dust off 
‘The waiter wiped off the dust.’ 
b. Kyparen torkade av dammet. 
waiter wiped off dust 
‘The waiter wiped off the dust.’ 


Hence, L1 Swedish L2 English speakers should pattern with the other Scandinavian 
V2 speakers and should not show a GC/MG effect in these constructions, since the 
particle placement in sentence final position is not an option, and even if there is 
GC/MG, there is again no conflict between the L1 and the L2 data. 

Danish is the more interesting test-case here. Consider the following data 
from Svenonius (1996): 


(40) a. Tjeneren tørket støvet af. 
waiter wiped dust off 
‘The waiter wiped the dust off.’ 
b. *Tjeneren tørket af støvet. 
waiter wiped off dust 

A brief comparison of the data in (39a/b) and (40a/b) shows that Danish is the 
mirror image of Swedish with respect to verb-particle constructions. Hence Dan- 
ish, unlike Swedish, allows particle stranding but does not allow particle pied- 
piping. In that respect, Danish patterns with the other Germanic V2 languages 
tested here, i.e. German and (to some extent) Dutch. We may therefore expect L1 
Danish speakers of L2 English to produce higher error rates than L1 Norwegian/ 
Icelandic/Swedish L2 English speakers and thus to pattern with the L1 Dutch/ 
German L2 English speakers in this respect. Further testing for these constructions 
is in progress. 
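
The logic behind these predictions can be summarised in a small sketch. It rests on a deliberately simplified assumption of our own, namely that a GC/MG conflict is expected whenever the L1 lacks the V - Part - XP (pied-piped) order; the class and function names are hypothetical, and Dutch is left out because its status is intermediate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticleSyntax:
    stranding: bool    # V - XP - Part order available in the L1
    pied_piping: bool  # V - Part - XP order available in the L1


# Particle placement options as described in the text (cf. (38)-(40)).
L1_PROFILES = {
    "German": ParticleSyntax(stranding=True, pied_piping=False),
    "Norwegian": ParticleSyntax(stranding=True, pied_piping=True),
    "Icelandic": ParticleSyntax(stranding=True, pied_piping=True),
    "Swedish": ParticleSyntax(stranding=False, pied_piping=True),
    "Danish": ParticleSyntax(stranding=True, pied_piping=False),
}


def predicts_higher_error_rates(language):
    """Simplified rule: conflict expected if the L1 lacks V - Part - XP."""
    return not L1_PROFILES[language].pied_piping


for language in L1_PROFILES:
    print(language, predicts_higher_error_rates(language))
```

Under this rule, Danish groups with German, while Icelandic and Swedish group with Norwegian, which matches the predictions formulated above.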

From the studies discussed in this paper we can conclude that unlearning V2 
in SLA is a challenging task. L1 speakers of a V2 language show a GC/MG effect in 
wh-+particle constructions in L2 English that can be attributed to the V2 properties 
of their L1. 

However, when comparing the results from L1 speakers of various V2 lan- 
guages, we can also see that these speakers do not behave as a uniform group in 
L2 English. Rather, the specific properties of the V2 language need to be taken 
into account in order to determine how strong the GC/MG effect is going to be. We 
take this as an indication of two things. First, the data seem to suggest that a 
broad V2 parameter does not exist. Instead, several factors seem to conspire in 
generating a V2 effect, which can be dissected into a number of micro-parameters 
or micro-cues (cf. e.g. Westergaard 2009, 2014; Biberauer et al. 2014, and many 
others). Second, the data seem to indicate that the effect cannot be related to 
processing difficulties in L2. If processing were a relevant factor, the various 
groups of L1 speakers should pattern uniformly as they are all very advanced L2 
speakers with relatively similar levels of competence. Since they do in fact pattern 
differently, we argue for an explanation in terms of GC/MG and underlying gram- 
matical representations instead. 


Syntactic or semantic gender agreement in Dutch, German and German 
learner Dutch: a speeded grammaticality judgement task 


Abstract: Dutch is currently undergoing a ‘resemanticisation’ of its pronominal 
gender, in which syntactic agreement is replaced with a system in which pronouns 
are chosen in accordance with the degree of individuation of the antecedent. Cur- 
rent accounts of resemanticisation link the process to the extent to which the 
three-way nominal gender distinctions are still entrenched. Using experimental 
data gathered with speeded grammaticality judgements from speakers of both 
Netherlandic and Belgian varieties of Dutch, of German, and of German learners 
of Dutch, we unambiguously relate the rise of semantic agreement in Dutch to an 
increased uncertainty with respect to grammatical gender. In addition, reaction 
time measurements suggest that an agreement system with a strong propensity 
towards grammatical agreement allows for faster processing of agreement rela- 
tions than systems in which semantic agreement plays a larger role. 


Zusammenfassung: Im Niederländischen findet zurzeit eine ‘Resemantisierung’ 
des pronominalen Genus statt, durch die syntaktische Kongruenz zunehmend 
durch ein System ersetzt wird, in dem die Wahl pronominaler Formen vom Grad 
der Individuierung des Antezedenten abhängt. Es wurde vermutet, dass der Pro- 
zess mit dem Grad der Verankerung (entrenchment) des Drei-Genera-Systems 
zusammenhängt. Anhand von Grammatikalitätsurteilen unter Zeitdruck (speeded 
grammaticality judgements) mit Sprechern niederländischer und belgischer Varie- 
täten und mit Sprechern des Deutschen durchgeführt, sowie auch mit deutschen 
Niederländischlernern, demonstrieren wir eindeutig den Zusammenhang zwi- 
schen der Zunahme semantischer Kongruenz und einer Unsicherheit in Bezug auf 
das grammatische Genus. Darüber hinaus sprechen die Analysen der Reaktionszeiten 
dafür, dass ein stark grammatisch basiertes Genussystem eine schnellere 
Verarbeitung von Kongruenzbeziehungen erlaubt als ein System, in dem semanti- 
sche Kongruenz eine größere Rolle spielt. 


1 Introduction 


As is well-documented, Dutch is currently undergoing a transition by which a 
predominantly syntactic system of pronominal gender is resemanticised, which 
means replaced with a system in which semantics plays a larger role (e.g., Audring 
2009a). This yields several contrasts with the neighbouring German language, 
which has preserved a more conservative pronominal gender system. Two of these 
contrasts are illustrated in (1), adopting Dutch examples from Audring (2009a: 
73, 98):¹ 


(1) Semantic gender agreement in Dutch vs. German. 


(1a) ..maar het meisje [...] hoe oud is ze dan? (rarely: het ‘it’) 
..aber das Mädchen [...] wie alt ist es dann? (also: sie ‘she’) 
..but the girl [...] how old is she/it then? 

‘About the girl [...] how old is she really?’ 

(1b) da’s zo handig met wol [...] want je kunt ’t 
das ist so praktisch mit Wolle [...] denn du kannst sie 
that is so handy with wool [...] because you can it 
overal tussen stoppen (also: hem, ze) 
überall zwischen stopfen (not: ihm, es) 
everywhere between stuff (not: him, her) 

‘that’s so handy about wool [...] because you can stuff it between 
everything’ 


The example in (1a) concerns the noun meisje/Mädchen ‘girl’, which refers to an 
animate entity ranking high on the so-called Individuation Hierarchy (Sasse 1993; 
Siemund 2008), as do all entities that carry biological gender. In this case, there 
is a conflict between biological gender (female) and the noun’s neuter gender, 
which yields the possibility to use feminine pronouns like Dutch ze or German 
sie ‘she’, rather than the neuter pronoun (Dutch het, German es ‘it’). In this and 
comparable cases in the animate domain, Dutch more commonly applies the 
so-called natural gender rule than German (Kraaikamp 2017: 63-73). Example (1b) 
concerns the low end of the Individuation Hierarchy, in that a mass noun, viz. wol/ 


1 We would like to thank Marc Brysbaert at Ghent University and Josje Verhagen at Utrecht 
University for assistance in recruiting test persons and for letting us use psycholinguistic testing 
infrastructure. Thanks are also due to Holger Hopp for sharing the experimental task with us and 
to two anonymous reviewers for their constructive criticism and comments. 
Wolle ‘wool’, refers to an unspecific quantity of a substance. In such conditions, 
Dutch shows an increasing tendency to use the neuter pronoun (het), even if the 
noun involved has common gender (or is feminine in three-gender varieties of 
Dutch). Whereas the same phenomenon exists in German as well (see Audring 
2009a: 193 for examples), it is by no means as widespread as in Dutch (Kraaikamp 
2017: 74). (1b) also hints at a further difference between Dutch and German: in 
line with its triadic gender system, German uses masculine and feminine 
pronouns to refer to masculine and feminine nouns, respectively. Most varieties of 
Dutch, however, have collapsed masculine and feminine gender into common 
gender (or de-nouns), and use masculine pronouns for syntactic agreement with 
this category. Example (1b) is a case in point: the pronoun hem ‘him’ is much 
more commonly used than ze ‘her’ in Dutch, even though the noun wol is, as a 
cognate of German Wolle ‘wool’, historically feminine. 

This article zooms in on the changes illustrated in (1b), and investigates the 
psycholinguistic status of the variants involved. Current research on resemantici- 
sation has primarily dealt with data from language usage and has proposed a 
number of explanations for the phenomenon. While these are, essentially, psycho- 
linguistic in nature, there have been few attempts to tap into speakers’ knowledge 
of their gender system more directly. Section 2 provides an overview of research 
carried out by means of usage data and formulates several hypotheses regarding 
the psycholinguistics of pronominal gender in Dutch. Section 3 then describes the 
method adopted in our own investigation, which is geared at providing more 
insight into the status of syntactic vis-a-vis semantic gender agreement by means 
of two speeded grammaticality judgement tasks, targeting the relation between 
syntactic and semantic agreement (Experiment 1), and the masculine-feminine 
distinction (Experiment 2), respectively. Section 4 describes the results, and in sec- 
tion 5 some conclusions are drawn. 


2 Production data on Dutch and German 
pronominal gender: Overview 


As said above, this article zooms in on the resemanticisation of pronominal 
gender in the inanimate domain, which was illustrated by means of (1b). The 
term ‘resemanticisation’, coined by Wurzel (1986), reflects the assumption that all gender 
systems have a semantic core (Corbett 1991: 63), including those of the 
common ancestor languages of Dutch and German. Audring (2006: 108) there- 
fore evaluates the change observed in Dutch as a result of semantic agreement 
merely becoming more visible (again), rather than as an original development 
(cf., among others, Schwink 2004 and Jobin 2011 on Proto-Germanic, Matasović 
2004 and Luraghi 2011 on Proto-Indo-European, and Kraaikamp 2017 for further 
evidence on Dutch). 

Whether semantic or syntactic agreement is used in a given situation depends 
on many factors, some of which are inherent to the referent of the pronoun (e.g., 
whether it is highly individuated or not) or the noun involved (e.g., usage frequency, 
cf. De Vos/De Vogelaer 2011), whereas others relate to the syntactic or discourse 
context in which the pronoun or its antecedent are used. Among the factors that 
have been proposed for Dutch are the pronoun’s grammatical case, the distance 
between antecedent and pronoun (Audring 2009a), the antecedent’s definiteness 
and its grammatical function, the verb of both the antecedent and the pronoun 
sentence (De Vos 2014), anaphoric vs. deictic reference, and the presence of a 
gender marker in the antecedent NP (Kraaikamp 2017). In addition, in a change- 
in-progress like resemanticisation in Dutch, sociolinguistic factors like speakers’ 
gender, age and social background may play a role as well (see Audring 2009a: 
168f. and De Vos 2014: 166-186 for discussion). Not all of these parameters can 
be included in a comparative study, however, even more so because many of 
them interact and their effects should therefore be studied in a multivariate 
analysis of a larger dataset than the one used below. For the present investiga- 
tion, most of them will be kept constant. 

As for the most important factor, viz. the semantics of the pronoun’s referent, 
the investigation needs to take into account variation in the way semantic agree- 
ment is implemented. While the use of neuter pronouns in reference to non-neuter 
mass nouns has been documented throughout (and beyond) the West Germanic 
languages, there appears to be variation with respect to inanimates ranking more 
highly on the Individuation Hierarchy: for instance, Audring’s (2009a) analysis 
of the Spoken Dutch Corpus zooms in on the geographical centre of the Dutch 
language area (Holland or the broader Randstad area), and reveals a tendency to 
generalize masculine pronouns for highly specific and/or delineated referents, 
typically count nouns, even if these have neuter gender (e.g., a count noun like 
het boek ‘the book’ would increasingly be referred to with hij ‘he’ or hem ‘him’). 
Similar examples from peripheral areas in the Netherlands and from Belgium are 
lacking, however; it appears as if in these areas neuter het ‘it’ is expanding its use 
in referring to all inanimates, including count nouns (e.g., a count noun like de 
doos ‘the box’ would be referred to with het ‘it’, albeit less frequently than mass 
nouns). In a schema such as (2), the variation is described in terms of different 
cut-off points between the usage ranges of different pronouns: whereas Holland 
distinguishes between highly and lowly individuated inanimates for pronominal 
reference, the semantic system elsewhere treats inanimates as a single category 
that triggers, from a semantic point of view, the use of het ‘it’. 
(2) Semantic gender in Dutch: different cut-off points.

Human > Other animate > Bounded object/abstract > Specific mass > Unspecific mass, unbounded abstract

Holland (cf. Audring 2006: 103): HIJ/ZIJ | HIJ | HET
Belgium, periphery of the Netherlands (cf. Klom/De Vogelaer 2017): HIJ/ZIJ | HET


Similar variation is known from English, where the standard variety has general- 
ized it for inanimates and animates with unknown (or backgrounded) biological 
gender, much like the varieties of Dutch spoken in the periphery of the language
area are doing. Several non-standard varieties have alternative systems at their 
disposal, however, which resemble the Hollandic system, in which he/him are 
used in combination with highly individuated inanimates (see cask in example 3a 
from Siemund 2008). The few data that are available for German, which has only 
marginal proportions of semantic gender, indicate that it behaves like peripheral 
varieties of Dutch: a few exceptions notwithstanding, deviations from lexical 
gender are explained either as effects of the natural gender rule, or as neuter 
pronouns used in line with a masculine/feminine referent’s low individuation 
(Kraaikamp 2017: 68-73; cf. example 3b). 


(3) Examples of semantic gender in English and German. 


(3a) Thick there cask ‘ont hold, tidn no good to put it [the liquid] in he [the cask]
(Southwest of England; Siemund 2008: 46)


(3b) Wir müssen zuerst Erde[fem.] entsorgen. Ich hoffe, dass es[neut.]
we must first earth dispose.of. I hope that it
mit einem mal transportiert werden kann. (Kraaikamp 2017: 70)
with one time transported become can
‘First we must dispose of the earth. I hope it can be transported in one
time.’


Apart from different implementations of semantic gender in the inanimate domain, 
Dutch also shows extensive variation with respect to the degree of semantic gender 
that is observed. In general, Netherlandic Dutch has developed a stronger preference
for semantic gender, whereas Belgian varieties show higher proportions of
syntactic agreement (compare Audring 2009a and De Vos 2014). This correlates
with the fact that Belgian varieties have preserved richer gender marking in the noun
to the fact that Belgian varieties have preserved richer gender marking in the noun 
phrase: in the inflectional paradigm of articles and/or adjectives, many Belgian 
varieties still distinguish between masculine and feminine gender (usually by 
attaching an -en-suffix to articles and adjectives preceding masculine nouns) and 
therefore qualify as three-gender varieties of Dutch. In contrast, the two-gender 
varieties spoken in the northern half of the language area have merely preserved 
the common-neuter distinction (see Van Ginneken 1934f. and 1936f. for maps). 
While suffixes marking the masculine-feminine distinction in the adnominal 
domain are associated with non-standard varieties, the distinction between two- 
and three-gender varieties is also relevant for Standard Dutch, in that it is reflected 
in pronominal reference: the south, most notably Belgian Dutch, (also) uses ze to 
refer to feminine de-words, whereas the north has generalised masculine hij/hem 
‘he/him’ for all de-words. The three-gender area appears to be eroding: Hoppen- 
brouwers (1983) documents the loss of the masculine-feminine distinction in both 
pronominal and adnominal gender in a North-Brabantic variety of Dutch, and 
relates it to processes of dialect levelling and loss. De Vogelaer/De Sutter (2011) 
show that, within the three-gender area, the varieties with the richest adnominal 
system are also the most resilient ones with respect to resemanticisation.²

The tight link between richness of the adnominal paradigm and degree of 
resemanticisation is observed in other languages: both in Germanic and in 
Romance, resemanticisation appears to have affected two-gender systems more 
systematically than three-gender systems (Siemund 2008; Fernández-Ordóñez
2009; Audring 2009a: 198f.). In addition, Audring (2009a: 211, 2009b) reveals a 
typological link between purely pronominal gender marking and semantic gen- 
der, with distinctions relating to individuation (count/mass, animate/inanimate) 
ranking among the most common semantic parameters steering pronominal 
agreement. This suggests a causal link between the (partial) collapse of the inflec- 
tional system in the adnominal domain, and resemanticisation, in that syntactic 
systems of pronominal agreement can only be upheld if they are ‘supported’ by 


2 The state border between Belgium and the Netherlands does not coincide at all with the isogloss
separating the two- and three-gender areas. Since the latter area extends deep into the
Netherlands, it is common in the dialectological literature on the topic to distinguish northern 
and southern varieties rather than Netherlandic and Belgian ones. Since standardisation processes 
have exerted stronger pressure on the three-gender varieties in the Netherlands, the two- and 
three-gender systems have become associated with Netherlandic and Belgian Dutch, respectively, 
and we conveniently describe the contrasts as differences between national varieties of Dutch (see 
Klom/De Vogelaer 2017 for elaboration, however). 
an adnominal system. Ultimately, resemanticisation would then be explained
psycholinguistically by means of scenarios that can be called ‘learnability’ or 
‘entrenchment accounts’: in a language with covert gender like Dutch, nouns’ 
gender can only be learned through the behaviour of associated words. To be 
successfully acquired, then, gender must be properly entrenched in the input. 
Audring’s findings indicate that this is only the case in languages with sufficiently 
rich and consistent gender marking in the adnominal system.³ If gender marking
no longer allows learners to acquire the system, language users increasingly have 
to resort to semantic rules in pronominal reference, which, in the long run, could 
lead to language change. 

There are different hypotheses on the precise features of the adnominal system
that ensure that grammatical gender is properly entrenched. Audring’s (2009a:
172) ‘mismatch hypothesis’ points out that the contrast between two genders in 
the adnominal domain and three pronominal genders may be problematic. Since 
resemanticisation is also observed in three-gender varieties of Dutch, however, 
incomplete patterns of syncretism may suffice to trigger the change. In this vein, 
De Vogelaer/De Sutter (2011: 195) discuss the role of n-deletion in paradigms in 
which -n is used as the main marker for masculine gender, and also illustrate the 
role of the masculine indefinite article ne(n), which apparently causes East- 
Flemish varieties of Dutch to lag behind in resemanticisation in comparison to 
West-Flemish, which has an invariable indefinite article. The loss of gender 
marking on the indefinite article is also mentioned by Kraaikamp (2017: 126f.), 
who points out that gender marking has been lost in more agreement targets in 
Dutch, such as most possessives, and attributive articles in definite NPs. Apart 
from the distinctiveness of agreement suffixes and the number of agreement 
targets, the gender assignment system may also play a role: Dutch gender is 
assumed to be by and large arbitrary (Audring/Booij 2009), which contrasts with 
languages in which lexical gender is motivated on formal or semantic grounds 
(e.g., German, in which nouns ending in schwa tend to be feminine, and nouns denoting
long objects tend to be masculine; see Köpcke/Zubin 1983), and/or even morphologically
marked on the noun (e.g., Italian has masculines in -o and feminines in -a).


3 One can speculate about the reasons why pronominal agreement would not fulfill such an 
entrenchment requirement. A possible factor would be the, on average, larger distance between 
pronouns and their controllers (the noun), and their, in typological perspective, stronger pref- 
erence for semantic agreement. Both blur the agreement relationship between pronouns and 
their antecedents. In addition, languages like Dutch use surprisingly few pronouns to refer to 
inanimates. The latter point can be illustrated with data from De Vos (2014), who finds a mere 
3463 references to inanimates in the entire Flemish part of the Spoken Dutch Corpus, which 
consists of about 3 million words. 




Numerous acquisition studies illustrate that gender, as entrenchment accounts 
would predict, is indeed much harder to acquire in Dutch than in German. Whereas
mistakes in the adnominal domain are rare even in young German-speaking 
children (Mills 1986; Szagun et al. 2007), these abound in Dutch (Van der Velde 
2003: 128, 138; Cornips/Hulk 2006; Blom/Polišenská/Weerman 2008). In the pronominal
domain, Dutch-speaking children conceive of pronominal gender predominantly
as a semantic system; syntactic gender is acquired later (De Houwer
1987; De Vogelaer 2010; De Vos/De Vogelaer 2011). Children growing up in the 
three-gender area apply the grammatical gender system more consistently, both 
in the adnominal and pronominal domain (see, respectively, Cornips/Hulk 2006 
and De Vogelaer 2010). Not surprisingly, the grammatical gender system poses an 
even bigger challenge for non-native learners of Dutch (see Cornips/Hulk 2006; 
Blom/Polišenská/Weerman 2008, and Loerts 2012 for adnominal gender, and van
Emmerik et al. 2009 for pronouns). German learners of Dutch are an exception to 
this, however. Since such learners have been found to use a “direct gender translation
strategy” (Sabourin/Stowe/De Haan 2006: 24), and given extensive correspond- 
ences between German and Dutch gender both on the systemic and the lexical 
level, syntactic agreement should be acquired with relative ease. Since they have 
also been found to exploit knowledge of their native language (L1) even in cir- 
cumstances where correlations between the L1 and the L2 are missing (Lemhöfer/ 
Schriefers/Hanique 2010: 157), an investigation into German-speaking learners 
of Dutch may reveal whether some of the recent findings regarding the Dutch 
gender system transfer to German as well. 

The results of acquisition studies are, broadly speaking, in line with the 
predictions yielded by entrenchment accounts of resemanticisation. Yet these 
acquisition data, like other production data, do not provide any direct insight
into the grammatical knowledge of the language users involved. Even if rese- 
manticisation is observed, for instance, it cannot be determined to what extent 
this is caused by a weakened entrenchment of lexical gender or whether this 
relates to the acceptability of syntactic agreement being affected. Changing usage 
preferences may also be explained by conscious attempts to adopt a system 
increasingly favouring semantic agreement, especially since resemanticisation 
appears to be most strongly observed in Holland, which is the normative centre 
of the Dutch language area. Therefore, this article aims at tapping more directly 
into the Dutch and German gender system, using data from a psycholinguistic 
experiment carried out on German, Netherlandic Dutch, and Belgian Dutch 
speakers, and on German learners of Dutch. The following hypotheses will be 
explored: 

- Syntactic agreement is expected to be the dominant agreement mode in
  German, and to be more stable in Belgian Dutch than in Dutch from the
  Netherlands. German learners of Dutch are expected to rely on lexical gender,
  i.e. to use syntactic agreement, more than on semantics.

- German still shows a stable three-gender system, whereas feminine gender
  is vulnerable in Dutch, and is no longer found at all in Netherlandic varieties.

- Semantic agreement is more strongly observed in Netherlandic Dutch than
  in Belgian Dutch. German shows a non-negligible amount of semantic
  agreement, too, and in particular allows combinations of lowly individuated
  non-neuter nouns and neuter pronouns.


3 Method 


3.1 Participants 


The method adopted for this psycholinguistic investigation is the so-called 
speeded grammaticality judgement task: participants are asked to evaluate the 
grammaticality of a number of test sentences as fast as possible. Both the partici- 
pants’ evaluations and their reaction times provide insight into their underlying 
grammatical knowledge. 


[T]he speeded presentation of the stimuli and the rapidly enforced judgement are taken to 
reflect processing strategies because the pace of the task (a) forces the parser to adopt its 
preferred parsing route and (b) does not allow for enough time to complete reanalysis [...]. 
The rationale underlying the speeded judgement paradigm is that, under time pressure, 
sentences dispreferred by the parser elicit lower accuracy scores and higher reaction times 
than comparable control sentences. (Hopp 2007: 238) 


Speeded grammaticality judgements make it possible to investigate separately the
role of semantic vis-à-vis syntactic agreement in sentence processing, because all
conceivable combinations of noun gender and particular pronouns can be tested,
including infrequent and ungrammatical patterns. As such they provide informa- 
tion that remains invisible in an analysis of production data, which primarily 
yield insight into which variants are preferred (cf. Tremblay 2005: 159). In order to 
test the hypotheses formulated above, two experiments were developed, focusing 
on the alternation between syntactic and semantic agreement for masculine and 
neuter antecedent NPs, and on the masculine-feminine distinction, respectively. 
Both experiments were carried out in four groups of participants, viz. L1 speakers 
of a Netherlandic variety of Dutch (recruited in Utrecht; n=23), L1 speakers of a 
Belgian variety of Dutch (recruited in Ghent; n=25), German learners of Dutch 
(recruited in Münster; n=28), and L1 speakers of German (recruited in Münster 
and Vienna; n=20). All participants were university students in the age range
20-45 years. 


3.2 Material 


The investigation focused on mass nouns, since only lowly individuated nouns 
behave uniformly with respect to resemanticisation (see above). To address our 
hypotheses, nouns of a given lexical gender were combined with certain pronouns,
with Experiment 1 testing the effect of semantic vis-à-vis syntactic agreement,
and Experiment 2 testing the resilience of feminine gender. Of nine possible
gender-pronoun combinations, two were excluded: one because it does not occur in
production data (neuter noun + feminine pronoun), the other because its behaviour
can be predicted on the basis of other categories (the combination feminine noun
+ neuter pronoun is expected to behave similarly to masculine noun + neuter
pronoun). The experimental items were selected on the basis of both the German
and the Dutch gender system, as described in the Duden (2015 edition) and the 
Grote Van Dale (2015 edition), respectively. For Dutch gender, the opposition 
between masculine and feminine de-words was considered, since the investiga- 
tion also targets three-gender varieties of Dutch. Some investigations on Dutch 
gender have yielded frequency effects (e.g., De Vos/De Vogelaer 2011; De Vogelaer 
2012), so usage frequency was included in the analysis, too. Although the nouns
were not selected to vary systematically in frequency, and only cover a limited
frequency range, we tested whether frequency affected the acceptance of
the conditions in both experiments by adding the frequency of the nouns according
to the SUBTLEX-NL corpus (Keuleers/Brysbaert/New 2010) to all models as a
continuous predictor variable.

All nouns are cognates in Dutch and German, with identical gender in both 
languages (with one exception)⁴; hence, the Dutch and German versions
of the test consisted of maximally equivalent test sentences. Both experiments
consisted of 75 sentences each, i.e. 36 experimental items and 39 fillers
(15 grammatical and 24 ungrammatical ones), which were included to avoid rou- 
tine answering strategies (cf. Hopp 2007: 240). The order of the test items was 


4 The one exception, viz. stof/Stoff (‘fabric’), has feminine gender in Dutch and masculine in 
German, which in fact allows detecting a “direct gender translation strategy” (Sabourin/Stowe/De 
Haan 2006: 24) on the part of the German learners of Dutch. Indeed they do not use any feminine 
pronouns in the production task and consistently rate feminine pronouns as ungrammatical (see 
Urbanek et al. 2017: 162-164 for further discussion). 
randomised, with experimental and filler sentences alternating in an unpredictable
manner.
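
The composition of one experiment can be sketched in a few lines of Python. The counts (36 experimental items, 15 grammatical and 24 ungrammatical fillers) are taken from the text; the seeded shuffle and the function name `build_trial_order` are illustrative assumptions, since the chapter does not specify how the randomisation was implemented.

```python
import random

# Counts per experiment as reported in the text.
N_EXPERIMENTAL = 36
N_FILLER_GRAMMATICAL = 15
N_FILLER_UNGRAMMATICAL = 24


def build_trial_order(seed):
    """Return a shuffled list of (kind, index) labels for one experiment.

    The seed is a hypothetical parameter (e.g. one per participant); the
    original study may have randomised differently.
    """
    trials = (
        [("experimental", i) for i in range(N_EXPERIMENTAL)]
        + [("filler_grammatical", i) for i in range(N_FILLER_GRAMMATICAL)]
        + [("filler_ungrammatical", i) for i in range(N_FILLER_UNGRAMMATICAL)]
    )
    rng = random.Random(seed)
    rng.shuffle(trials)  # experimental and filler trials end up interleaved
    return trials


order = build_trial_order(seed=1)
print(len(order))  # 75 sentences per experiment
```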

The test sentences from Experiment 1 represent four conditions, with 18 
masculine nouns and 18 neuter nouns, of which half are combined with a mas- 
culine subject pronoun (Dutch hij, German er), and the other half with a neuter 
pronoun (Dutch het, German es). In Experiment 2, four conditions were tested 
as well: half of 18 masculine and 18 feminine nouns combine with masculine 
subject pronouns (Dutch hij, German er), and the other half with a feminine 
pronoun (Dutch ze, German sie). Since the masculine test items were used in 
both Experiment 1 and 2, the total number of nouns in the test equals 54. To 
ensure maximal comparability across the experimental conditions, two versions 
of both experiments were designed, in which test items were combined with a 
different pronoun, yielding A and B versions of both experiments. Table 1 pre- 
sents a few examples of test sentences from both versions (of the Dutch test). 
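
The counterbalancing across the A and B versions can be illustrated with a small sketch for Experiment 1. The noun labels are placeholders rather than the actual test items, and the alternating assignment rule is an assumption: the text only states that half of the nouns of each gender combine with each pronoun and that the two versions pair each item with a different pronoun.

```python
from collections import Counter

# Experiment 1: 18 masculine and 18 neuter mass nouns (placeholder labels).
NOUNS = [(f"masc_noun_{i}", "masc") for i in range(18)] + [
    (f"neut_noun_{i}", "neut") for i in range(18)
]
PRONOUNS = ("masc", "neut")  # Dutch hij / het, German er / es


def build_lists(nouns, pronouns):
    """Pair each noun with one pronoun in list A and the other in list B."""
    list_a, list_b = [], []
    seen = Counter()  # number of nouns of each gender assigned so far
    for noun, gender in nouns:
        k = seen[gender] % 2  # alternate pronouns within each gender
        seen[gender] += 1
        list_a.append((noun, gender, pronouns[k]))
        list_b.append((noun, gender, pronouns[1 - k]))
    return list_a, list_b


list_a, list_b = build_lists(NOUNS, PRONOUNS)
cells = Counter((gender, pronoun) for _, gender, pronoun in list_a)
print(cells)  # nine items in each of the four conditions
```

Under these assumptions every condition contains nine items per list, and every noun is seen with both pronouns across the two versions.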


Table 1: Examples of test sentences in Dutch


Experiment 1, List A:

De suiker[masc.] is gevaarlijk, want het[neut.] is oud.
The sugar is dangerous because it is old.
De lijm[masc.] vloeit niet, want hij[masc.] is droog.
The glue does not flow because he is dry.
Het gras[neut.] brandt niet, want het[neut.] is sappig.
The grass does not burn because it is juicy.
Het koper[neut.] glanst sterk, want hij[masc.] is nieuw.
The copper shines brightly because he is new.

Experiment 1, List B:

De suiker[masc.] is gevaarlijk, want hij[masc.] is oud.
The sugar is dangerous because he is old.
De lijm[masc.] vloeit niet, want het[neut.] is droog.
The glue does not flow because it is dry.
Het gras[neut.] brandt niet, want hij[masc.] is sappig.
The grass does not burn because he is juicy.
Het koper[neut.] glanst sterk, want het[neut.] is nieuw.
The copper shines brightly because it is new.

Experiment 2, List A:

De maïs[masc.] smaakt goed, want ze[fem.] is vers.
The maize tastes well because she is fresh.
De azijn[masc.] brandt wat, want hij[masc.] is pikant.
The vinegar burns a bit because he is spicy.
De zeep[fem.] kost veel, want hij[masc.] is mild.
The soap costs much because he is mild.
De soep[fem.] ruikt lekker, want ze[fem.] is gekruid.
The soup smells nice because she is seasoned.

Experiment 2, List B:

De maïs[masc.] smaakt goed, want hij[masc.] is vers.
The maize tastes well because he is fresh.
De azijn[masc.] brandt wat, want ze[fem.] is pikant.
The vinegar burns a bit because she is spicy.
De zeep[fem.] kost veel, want ze[fem.] is mild.
The soap costs much because she is mild.
De soep[fem.] ruikt lekker, want hij[masc.] is gekruid.
The soup smells nice because he is seasoned.


Since the likelihood of semantic agreement in Dutch is influenced by the syntax 
of the antecedent NP (e.g., definiteness) as well as by the predicate of both the 
antecedent clause and the pronoun clause (De Vos 2013, 2014), test sentences uniformly
contained antecedents used in a definite NP, in a clause with an activity
verb (n=32), a state verb (n=32), or a combination of the copula zijn/sein ‘to be’ 
and an adjective (n=8). The activity and state verbs in the antecedent sentences 
were followed by an adverb to keep the length of the sentence equivalent. The 
pronoun sentence was invariably introduced by the complementizer want/denn 
‘because’, followed by the anaphoric pronoun, the copula zijn/sein ‘to be’ and 
an adjective. All lexical items consisted of maximally three syllables. 

The ungrammatical fillers contained ungrammatical plurals, ungrammatical
word orders, and incorrect verb agreement. The grammatical filler sentences
were modelled after the ungrammatical ones, but did not contain any 
ungrammaticalities. 


3.3 Procedure 


The two experiments were designed with the software package E-Prime (Schneider/ 
Eschman/Zuccolotto 2012). Sentences were shown in a word-by-word fashion on a 
computer screen. Each word was shown for 250 milliseconds (ms) plus 18 ms per letter,
before a blue screen appeared and participants were asked to evaluate the sen- 
tences’ grammaticality by means of a red (ungrammatical, rightmost button) and 
green button (grammatical, leftmost button). The maximal time allotted for the 
judgements was 4 seconds. In between both experiments, participants were 
administered a language background questionnaire (LSBQ, Anderson et al. 2017). 
After the second experiment they were asked to take part in a production test, in 
which they had to fill in pronouns in a questionnaire containing the same test
sentences as the experiments. For the analysis carried out in this contribution, 
the language background questionnaire was merely used to detect participants 
with special backgrounds (e.g., bilingual education, non-native speakers, ...); 
results of the production test are not analysed here (but see Urbanek et al. 2017). 
The entire procedure took about 30 minutes per participant. 
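
The presentation timing can be made concrete with a short sketch. The constants (250 ms per word, 18 ms per letter, a 4-second response window) come from the description above; the function names are hypothetical, and counting only alphabetic characters (so that attached punctuation adds no time) is an assumption, as the text does not say how punctuation was handled.

```python
BASE_MS = 250         # fixed display time per word
PER_LETTER_MS = 18    # additional display time per letter
DEADLINE_MS = 4000    # maximal time allotted for the judgement


def word_duration_ms(word):
    """Display time of one word; punctuation is not counted (assumption)."""
    n_letters = sum(1 for ch in word if ch.isalpha())
    return BASE_MS + PER_LETTER_MS * n_letters


def sentence_duration_ms(sentence):
    """Total word-by-word presentation time of a sentence."""
    return sum(word_duration_ms(word) for word in sentence.split())


def timed_out(reaction_time_ms):
    """True if a judgement came after the response deadline."""
    return reaction_time_ms > DEADLINE_MS


print(word_duration_ms("suiker"))  # 250 + 6 * 18 = 358
print(sentence_duration_ms("De lijm vloeit niet, want hij is droog."))  # 2540
```

On these assumptions an eight-word test sentence of the kind shown in Table 1 is presented in roughly two and a half seconds before the judgement screen appears.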


4 Results 


4.1 Experiment 1: syntactic vs. semantic gender 


Experiment 1 was designed to evaluate the status of syntactic and semantic 
agreement in the language varieties involved. Nouns of masculine and neuter 
gender were combined with either a masculine or a neuter pronoun, constituting
four different conditions. Given the fact that all nouns involved were mass nouns,
combinations with masculine nouns present unambiguous examples of syntactic 
agreement (masculine > hij/er ‘he’) or semantic agreement (masculine > het/es 
‘it’). Combinations of neuter nouns with masculine pronouns are neither moti- 
vated by syntactic, nor by semantic agreement; for combinations of neuter nouns 
with neuter pronouns, syntactic and semantic agreement match. While a detailed 
analysis of the data is carried out below using mixed effect models, a first glance 
at the overall results in Figure 1 already shows that, in general, combinations with 
neuter nouns trigger the clearest evaluations: the combination of neuter nouns 
with neuter pronouns yields the highest acceptance ratios, and the combination 
of neuter nouns with masculine pronouns is most strongly, but not across the 
board, judged ungrammatical. This corresponds to the fact that syntactic and 
semantic agreement have matching outcomes for neuter nouns. The results with 
masculine nouns, for which syntactic and semantic agreement conflict, tend to 
be more mixed. A slight preference for masculine pronouns is observed in the 
German L1 speakers and the German learners of Dutch, and for neuter pronouns 
in both Netherlandic and Belgian L1 speakers of Dutch. 
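
The logic of the four conditions in Experiment 1 can be summarised in a minimal sketch. It assumes, as the text does for mass nouns, that semantic agreement always selects the neuter pronoun; the function name and the string labels are illustrative only.

```python
def agreement_type(antecedent_gender, pronoun_gender):
    """Classify an antecedent-pronoun pairing for a mass noun (Experiment 1)."""
    syntactic = pronoun_gender == antecedent_gender
    semantic = pronoun_gender == "neut"  # mass nouns: semantics favours het/es
    if syntactic and semantic:
        return "syntactic+semantic"
    if syntactic:
        return "syntactic"
    if semantic:
        return "semantic"
    return "neither"


for antecedent in ("masc", "neut"):
    for pronoun in ("masc", "neut"):
        print(antecedent, pronoun, agreement_type(antecedent, pronoun))
```

The two principles conflict only for masculine antecedents, which is why those conditions are the informative ones for the alternation between syntactic and semantic agreement.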
Fig. 1: Grammaticality judgements on syntactic vs. semantic agreement

To analyse the differences between the tested groups, we used logistic mixed
effects models with ‘choice’ (grammatical, ungrammatical) as dependent variable
and random intercepts for items and participants. As a complex model that
included the three-way interaction between (participant) ‘Group’, ‘Pronoun’ and
‘Antecedent’ (gender: masc/neut) did not converge, and as our research questions 
do not concern potential main effects of ‘Group’ as far as the judgements are con- 
cerned, we conducted separate subset models, testing for effects of Pronoun, 
Antecedent (masc/neut), and their interaction in each of the four groups. A further 
parameter, ‘Frequency’, was not included in the final model, since an initial anal- 
ysis including this factor showed that neither a main effect of Frequency nor an 
interaction with another factor was observed in the three varieties of Dutch in the 
investigation. The results of these analyses are summarized in Table 2, which 
includes z-scores as a measure for effect size, and p-values for significance. 
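
For readers who wish to relate the reported z-scores to the significance levels, the following sketch converts a z-value into a two-sided p-value under the standard normal distribution and attaches the star notation used in the tables. That the reported values are Wald statistics evaluated in this way is an assumption; the chapter does not state how the p-values were obtained, so small discrepancies with the published figures are possible.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def stars(p):
    """Significance stars following the thresholds used in the tables."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


for z in (10.40, -5.68, 3.90):
    p = two_sided_p(z)
    print(f"z={z:+.2f}  p={p:.3g}  {stars(p)}")
```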


Table 2: Experiment 1: Mixed effect modelling of factors ‘Antecedent’, ‘Pronoun’ and their
interaction, for four participant groups


                       Antecedent (masc/neut)   Pronoun                Antecedent * Pronoun

Dutch (NL)             n.s.                     z=10.40, p<.001 ***    z=-5.68, p<.001 ***
Dutch (B)              z=1.90, p=.05            z=10.82, p<.001 ***    z=-8.94, p<.001 ***
L2 Dutch (L1 German)   n.s.                     z=5.44, p<.001 ***     z=-8.90, p<.001 ***
German                 z=1.90, p=.05            z=3.90, p<.001 ***     z=-14.45, p<.001 ***


These results show a significant interaction between Pronoun and Antecedent 
(masc/neut) in all groups, which was also visible in Figure 1 through the fact that 
the two pronouns do not have the same acceptability depending on the antecedent. 
This is the expected effect of syntactic agreement. More interestingly, the data also 
reveal a main effect of pronoun in all groups, because the pronoun het/es ‘it’ is 
overall more acceptable than the pronoun hij/er ‘he’. This is most strongly the 
case in the L1 varieties of Dutch, where it is in line with the expected semantic 
agreement pattern, but the effect is also observed in German learner Dutch and L1 
German. Finally, the marginal effect of Antecedent (masc/neut) in the Belgian 
Dutch and the German group is due to the overall less positive judgements for 
neuter antecedents. This effect is carried by the low acceptability of the combina- 
tion of a neuter antecedent and the pronoun hij/er in these two groups. 
Looking at some of the patterns in more detail, we find, first, that the combinations
of neuter nouns with masculine pronouns (hij/er ‘he’) yielded the least approval,
which is in line with the fact that they are the outcome of neither syntactic nor
semantic agreement. Still, in varieties of Dutch many more instances of this type
are judged grammatical than is the case for the ungrammatical fillers. This holds both
for L1 speakers (where ungrammatical fillers on average get some 10% positive 
evaluations vs. 28% (B) or 37% (NL) of neut_hij/er combinations) and for German 
L2 learners of Dutch (with 30% of ungrammatical fillers approved of vs. 48% of 
neut_hij/er combinations). Hence, in addition to resemanticisation, the answers 
of the Dutch L1 speakers, and likely also those of German L2 learners, can be 
interpreted as indications of uncertainty regarding gender agreement, which 
cannot be explained on semantic grounds. This uncertainty is not found in L1 
German. Second, the combinations with masculine nouns are particularly informative
for determining the alternation between syntactic and semantic agreement. In
both Netherlandic and Belgian Dutch, acceptance of semantic agreement (around 
75%) is higher than of syntactic agreement (around 60%). Although Belgian Dutch 
tends towards syntactic agreement slightly more than Netherlandic Dutch, the 
difference between both varieties is smaller than could have been expected from 
the literature. In contrast, in L1 German syntactic agreement (masc_hij/er) is by 
far the preferred option, despite a non-negligible acceptance of semantic agree- 
ment (masc_het/es) of 26%. The German L2 learners, finally, show a preference for 
syntactic agreement, which can be considered a transfer effect. They also show a 
high acceptance of semantically motivated neuter pronouns for masculine mass 
nouns, however, which could, in principle, either be transferred from their L1 or
be learned. Even though the 65% proportion of semantic agreement exceeds the
proportion of 26% found in the L1 German group, Urbanek et al. (2017) argue that 
semantic agreement in German L2 Dutch is transferred, since the pattern does not 
become stronger in more proficient learners. 

Regarding the participants’ reaction times (RT), the general expectation is 
that a low acceptance correlates with slower RTs (Hopp 2007: 238). However, the 
availability of both syntactic and semantic agreement may already impact RTs, in 
that computing the outcome in such a complex system may require additional pro- 
cessing effort. It can be hypothesized that processing will be faster when the gram- 
matical principles involved yield the same outcome than in cases of conflict. In 
Experiment 1, syntactic and semantic agreement yield matching outcomes for neu- 
ter nouns (with het/es ‘it’ being grammatical and hij/er ‘he’ ungrammatical) and 
mismatching outcomes for masculine gender nouns (with syntactic agreement 
yielding hij/er ‘he’ and semantic agreement het/es ‘it’, given that all nouns involved 
are mass nouns). The results are displayed in Figure 2, which orders data per inves- 
tigated variety to highlight intra-group differences. In contrast with Figure 1, a first 
glance reveals few tendencies holding across the board, apart from the fact that both 
in native and non-native Dutch, the neut_het/es-condition yields the fastest RTs. 
Fig. 2: Reaction times (milliseconds) on syntactic vs. semantic agreement


Again, there are substantial differences between the varieties involved. The data 
were analysed with a linear mixed effect model with the judging time as dependent 
variable, random intercepts for participants and items, and ‘Group’, ‘Pronoun’, 
‘Antecedent (masc/neut)’ as well as all two- and three-way interactions as predictors.
As this analysis yielded several significant two-way and three-way interactions,
we conducted additional separate analyses. In a first
step, we focused on potential overall effects of Group on the decision times. As
non-native speakers can plausibly be expected to respond more slowly simply because
they are non-native, this group was excluded from this analysis. Linear
mixed effect models with Group (Netherlandic Dutch, Belgian Dutch, German) as 
predictor revealed a marginally significant difference in overall decision time be- 
tween Netherlandic and Belgian Dutch (z=1.92, p=.06), a significant difference 
between Netherlandic Dutch and German (z=.68, p <.01), and no significant differ- 
ence between Belgian Dutch and German (z=0.88, ns). These differences reflect 
the fact that the Netherlandic participants made their decisions more slowly than 
the German participants, with the Belgian participants situated between these 
two groups.⁵ This may be interpreted to indicate that a system in which semantic


5 To interpret the RTs of the Belgian participants properly, it should be pointed out that these 
were recruited from a pool of experienced test persons, whereas the Netherlandic and German 
participants were novices. This may have affected RTs. Note that for the second experiment, 
differences between RTs obtained from Belgian and Netherlandic participants are much smaller. 
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rules have more weight could lead to slower processing of agreement than a sys- 
tem in which syntactic rules are predominant. 

In a subsequent step, we focused on effects of the two experimental factors 
and their interaction for subset models for each group, as done above for the 
offline decisions. The results are summarized in Table 3. These results reveal a 
heterogeneous picture, with subtle differences, but also common tendencies 
among the groups. 


Table 3: Mixed effect modelling of reaction times (RT) from Experiment 1, in relation to
‘Antecedent (masc/neut)’, ‘Pronoun’ and their interaction, for four participant groups


Group                  Antecedent (masc/neut)   Pronoun            Antecedent * Pronoun

Dutch (NL)             n.s.                     z=2.35, p<.05 *    n.s.
Dutch (B)              z=4.42, p<.001 ***       z=1.77, p=.08      n.s.
L2 Dutch (L1 German)   n.s.                     n.s.               z=2.96, p<.01 **
German                 z=2.59, p<.05 *          z=2.51, p<.05 *    n.s.
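
As a reading aid for the z and p values in Table 3, the following minimal sketch shows how a two-sided p-value can be derived from a z statistic. It assumes that the reported z-values are evaluated against a standard normal distribution; the text does not state how the p-values were obtained, so this is an assumption and not a description of the authors' procedure. The function names are illustrative only.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic, assuming a standard normal reference."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def significance_label(p: float) -> str:
    """Map a p-value to the star notation used in the tables."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    # Values at or above .05 are simply labelled n.s. in this sketch.
    return "n.s."


if __name__ == "__main__":
    # A few z-values taken from Table 3; the normal reference is an assumption.
    for z in (4.42, 2.96, 2.35, 1.77):
        p = two_sided_p(z)
        print(f"z={z:.2f}  p={p:.4f}  {significance_label(p)}")
```

Under this assumption, z=1.77 corresponds to a p-value of roughly .08, in line with the value reported for Pronoun in the Belgian Dutch group. Other cells need not match exactly, since the reference distribution actually used may differ.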


Across the board, RTs with neuter nouns are shorter than with masculine nouns, 
leading to a significant effect in the Belgian Dutch and the German group. The 
effect is most strongly observed in Belgian Dutch, where it can be related to the fact 
that syntactic and semantic agreement yield conflicting outcomes for masculine 
mass nouns. That the effect is not found in Netherlandic Dutch is due to the strik- 
ingly slow RTs for neut_hij/er, causing an overall asymmetry between RTs for neu- 
ter and masculine pronouns. An opposite effect for Pronoun is found in German, 
which shows slower RTs for combinations with neuter es ‘it’ than with masculine 
pronouns, and especially for masc_het/es. Since this relates to a non-marginal 
acceptance of masc_het/es in comparison to neut_hij/er, it is possibly the result of 
semantics interfering with grammatical agreement. 

The slow RTs for masc_hij/er illustrate that grammatical agreement is no 
longer the most expected option for L1 speakers of Dutch. The German learners, 
in contrast, provide the clearest evidence for a mainly grammatically dominated 
agreement system. In this group, a mismatch between the gender of the anteced- 
ent and the gender of the pronoun consistently led to longer reaction times, 
yielding a significant interaction between the two factors in this group. That such 
an effect is not visible in L1 German is due to the unexpectedly fast decision times
for the neut_hij/er condition. It is possible that the ungrammaticality of this 
structure was so striking for the L1 German group that it led to particularly fast 
rather than particularly slow decision times. Whether and under which conditions 
such a pattern surfaces could be further investigated in future studies. 


4.2 Experiment 2: preservation of feminine gender 


Experiment 2 is geared towards testing the resilience of feminine gender. Figure 3
shows the proportions in which combinations of both masculine and feminine 
nouns with masculine and feminine pronouns are accepted. German is expected 
to have maintained a clear distinction between masculine and feminine gender, 
unlike Netherlandic varieties of Dutch, which have collapsed masculine and 
feminine gender. Such a tendency towards ‘masculinisation’ is also observed in 
Belgian Dutch production data (Geeraerts 1992), but has not reached completion 
there. Figure 3 confirms that the masculine-feminine distinction is still clear-cut 
in German, whereas it has blurred in Dutch, including German learner Dutch. 
Fig. 3: Grammaticality judgements on syntactic agreement with masculine vs. feminine gender


As was done for Experiment 1, the data were analysed with generalized linear 
mixed effect models. A complex model involving the three-way interaction between 
‘Group’, ‘Antecedent (masc/fem)’ and ‘Pronoun’ did not converge. Given that, as in
Experiment 1, we had no hypotheses regarding a main effect of Group, we conducted
subset analyses for each of the four Groups. The results are summarized in
Table 4. 


Table 4: Experiment 2: Mixed effect modelling of factors ‘Antecedent’, ‘Pronoun’ and their
interaction, for four participant groups


Group                  Antecedent (masc/fem)   Pronoun               Antecedent * Pronoun

Dutch (NL)             n.s.                    z=7.59, p<.001 ***    n.s.
Dutch (B)              n.s.                    z=-6.36, p<.001 ***   z=-3.66, p<.001 ***
L2 Dutch (L1 German)   z=2.30, p<.05 *         z=5.12, p<.001 ***    z=-12.16, p<.001 ***
German                 n.s.                    n.s.                  z=-11.38, p<.001 ***


These results show that the Netherlandic Dutch participants are the only ones 
who do not take the masculine-feminine distinction into account at all, as 
reflected in the fact that it is the only group in which there is no interaction 
between the two experimental factors. Instead, there is a main effect of Pronoun, 
which is due to the fact that hij ‘he’ is judged as acceptable in about 20% more of 
the cases than ze ‘she’, irrespective of the grammatical gender of the antecedent. 
This confirms the process of masculinisation for Netherlandic Dutch that was 
mentioned above. In the three other groups, there is an interaction between the 
two experimental factors, due to the fact that the pronouns hij/er ‘he’ and ze/sie 
‘she’ are judged differentially depending on the grammatical gender of the 
antecedent, thus reflecting the masculine-feminine distinction. This interaction 
differs in strength in the three groups, however. As expected, it is clearest in the 
L1 German group, where Figure 3 shows that more than 90% of masculine pro- 
nouns are accepted with masculine nouns, and a similar proportion of feminine 
pronouns with feminine nouns. Use of a non-agreeing pronoun is judged 
ungrammatical (the scores of <10% are comparable to those of ungrammatical 
fillers). The absence of a main effect of Pronoun demonstrates that both pronom- 
inal forms are accepted to similar degrees. In the Belgian speakers of Dutch, there 
is only weak evidence of a distinction between masculine and feminine gender. 
While the interaction between Pronoun and Antecedent (masc/fem) plausibly 
reflects some knowledge of grammatical gender, the preference for ze ‘she’ over
hij ‘he’ is more general, since it is also observed with masculine nouns (where it
amounts to a mere 7% difference, much less than the 25% for feminine
nouns), and yields a reverse effect for Pronoun in comparison to Netherlandic
Dutch. Finally, in the German learners of Dutch, the interaction between the 
factors ‘Antecedent (masc/fem)’ and ‘Pronoun’ reaches the same strength as in 
L1 German, suggesting that grammatical gender is transferred from the L1 to the 
L2, and that it is the dominant factor of influence on the judgements. In addition, 
however, the learners show masculinisation: they accept hij ‘he’ with feminine 
antecedents much more often (i.e. 27% more) than ze ‘she’ with masculine ante- 
cedents, yielding a main effect of pronoun. As there are no signs of a similar 
tendency in the L1 German data, this is probably knowledge that these learners 
have acquired in the target language. 

A separate analysis was run for the variable ‘Frequency’. Since it may be 
hypothesized that the lexical gender of more frequent nouns has a higher chance
of being acquired, high frequency is expected to hamper masculinisation, at least in
varieties marking the masculine-feminine distinction. In general, the attested 
frequency effects link high frequency to resilience to change, as expected, but the 
effects appear relatively unsystematically and are weak. The Belgian Dutch data 
reveal a marginally significant interaction between the factors Antecedent 
(masc/fem) and Frequency (z=1.94, p=.05), which is predominantly carried by 
the high acceptance of the fem_hij condition for less frequent nouns. In the 
learner group, Experiment 2 yielded a significant three-way interaction between
Frequency, Antecedent (masc/fem) and Pronoun (z=2.80, p<.01), which is
mainly due to the higher acceptance of hij ‘he’ with the less frequent feminine 
antecedents, and the higher acceptance of ze ‘she’ with the more frequent femi- 
nine antecedents. 

The overall picture thus is one of a continuum, in which the Netherlandic 
Dutch group and the German group represent two extremes, with no knowledge 
of grammatical gender in one group, and no baseline preference for one of the 
two pronouns in the other group. The German learners of Dutch show an inter- 
mediate position, with a moderate degree of masculinisation, as does the Bel- 
gian Dutch group, which reveals an overall preference for feminine ze ‘she’. To 
our knowledge, this overall preference for ze ‘she’ has not been found in other 
studies, and contrasts rather sharply with spontaneous production data. For 
masculine inanimates in the Belgian part of the Spoken Dutch Corpus, De Vos 
(2014: 55) calculates figures of 62% masculine vs. 2% feminine pronouns; for
feminine nouns, the figures are 43% feminine and 6% masculine pronouns.
Rather than the preference for ze ‘she’ observed in our Experiment 2,
then, spontaneous production data show a fairly resilient masculine-feminine
distinction, and most deviations from grammatical gender in spontaneous
speech can be analysed as semantically motivated instances of het ‘it’. Still, 
there appears to be some asymmetry in the production data, too, in that femi- 
nine gender triggers more semantically motivated het ‘it’, and some masculini- 
sation is observed (6% of pronominal references to feminines are with mascu- 
line pronouns). 

When comparing the overall acceptance rates from Experiment 2 to previous 
production studies, it becomes evident that in the Netherlandic data, too, a 
remarkable tolerance is observed towards combinations that are rare in sponta- 
neous production, in particular combinations with ze ‘she’. Thus, Audring (2009a: 
96) finds feminine pronouns only for animate reference in her Netherlandic data, 
in contrast to their acceptance ratio of around 50% (51% for masculines and 
49% for feminines) in Figure 3. While such a high acceptance may partly result 
from exposure to alternative variants (for instance because Belgian Dutch 
shows masculinisation, too, and Netherlandic speakers may be familiar with 
three-gender varieties of Dutch), the remarkable discrepancies with usage data 
also point towards a general uncertainty regarding gender agreement. This 
uncertainty may be more visible in the current paradigm, where participants 
cannot avoid judgements for forms of which they feel unsure, whereas in spon- 
taneous production, gender-marked pronouns are indeed avoided (Audring/ 
Booij 2009). In combination with the so-called acquiescence bias (or ‘yes’-bias
effect), this may explain a substantial proportion of yes-answers (cf. Sabourin 
et al. 2006: 17). A comparison with Experiment 1, however, allows the generalisa- 
tion that the distinction between common (i.e., masculine and feminine de-words) 
and neuter nouns (het-words) is much more solid in Dutch than distinctions 
within common gender. 

With respect to RTs, the expectation that ungrammaticality correlates with 
longer RTs (Hopp 2007: 238) often does not allow strong predictions, in that differ- 
ences in acceptability between the conditions in Experiment 2 are very subtle for 
most varieties involved. In addition, even in L1 German, which is the only variety
with clearly ungrammatical combinations in the experiment, RT differences 
remain limited. The descriptive results are presented in Figure 4. 

A complex statistical model involving all factors yielded several two-way as 
well as a marginally significant three-way interaction. As for Experiment 1, we 
thus proceeded to subset analyses. In a first step, we tested for a main effect of 
Group on Judgement times, again excluding the non-native speakers. These ana- 
lyses revealed no significant difference between the Netherlandic and Belgian 
Dutch speakers, a marginal difference between the data for Belgian Dutch and for 
German (z=-1.91, p=.06), and a significant difference between Netherlandic Dutch 
and German (z=3.34, p<.01). These results correlate clearly with the resilience of 
the three-gender system. The consistently faster RTs for German seem to confirm
that a system with a strong propensity towards grammatical agreement allows for
faster processing of agreement relations.
Fig. 4: Reaction times (milliseconds) on syntactic agreement with masculine vs. feminine
gender


Turning to potential effects of the factors on RTs in the four groups, there were no 
significant effects in any of the three L1 groups. While in the two Dutch-speaking
groups, this may be taken as a reflection of rather subtle differences, it probably 
reflects particularly striking and categorical differences for the L1 speakers of 
German, which led to the grammatical sentences being quickly recognized as 
grammatical, and the ungrammatical ones as ungrammatical. As for the learners, 
there was a significant main effect of Pronoun (z=2.83, p<.01), and a significant 
interaction between Pronoun and Antecedent (masc/fem) (z=2.48, p<.05) (see 
also Urbanek et al. 2017). Both effects are probably carried by the particularly fast 
reaction times for the fem_ze/sie condition. While one should be wary of premature
generalisations, this may result from feminine ze/sie ‘she’ not being available
for semantic agreement in the inanimate domain, and the fem_ze/sie condition
thus being the one in which the least competition between syntactic and
semantic agreement is observed. 
5 Conclusions


At the outset of this article, three hypotheses were formulated regarding the psy- 
cholinguistic status of grammatical gender in the varieties included in the 
investigation. First, syntactic agreement was expected to be the dominant 
agreement mode in German pronouns, and to be more stable in Belgian Dutch
than in Dutch from the Netherlands. German learners of Dutch were expected to
rely on lexical gender, i.e. to use syntactic agreement, more than on
semantics. The expectation that German speakers and German learners of Dutch 
predominantly rely on syntactic agreement was borne out, whereas differences 
between the resilience of syntactic agreement in Netherlandic and Belgian Dutch 
were more subtle than could have been expected on the basis of the literature 
on spontaneous production. Second, German has a stable three-gender system, 
whereas feminine gender is vulnerable in Dutch: Netherlandic Dutch yields 
similar results for masculine and feminine nouns, which are thus collapsed into 
the category of common gender; Belgian Dutch maintains the distinction but 
hardly in a robust fashion. This is not to say that there are no differences between 
Belgian and Netherlandic Dutch, however, in particular with respect to the 
degree to which hij ‘he’ vs. ze ‘she’ is accepted by default for common gender
antecedents. Third, semantic agreement is more strongly observed in Netherlandic
Dutch than in Belgian Dutch, although here too, the difference is far from
spectacular. German shows a non-negligible amount of semantic agreement as 
well, and in particular allows combinations of lowly individuated non-neuter 
nouns and neuter pronouns. 

With respect to the ‘entrenchment accounts’ of resemanticisation (De Vogelaer/ 
De Sutter 2011; De Vos/De Vogelaer 2011; Kraaikamp 2017), our experiments show 
that the rise of semantic agreement in Dutch relates to an increased uncertainty 
with respect to grammatical gender, yielding highly mixed answers for several 
conditions in our experiments, and high RTs across the board. Our results are 
complementary to studies on usage documenting highly variable pronominal gen- 
der in Dutch and slow acquisition processes, and more directly link such findings 
to linguistic cognition. They support a scenario assuming a causal link between 
linguistic uncertainty and change, in that semantic agreement can be considered 
a default option that is becoming more important as the knowledge of the grammat- 
ical gender system is affected by processes of deflection rendering invisible the 
distinction between masculine and feminine gender. This seems to be corroborated 
by a number of frequency effects in the data, which all link high frequency to resil- 
ience to change, as expected in entrenchment accounts. However, the nouns in 
the investigation yield relatively unsystematic and weak effects, which may be 
due to the fact that they only cover a limited frequency range. 
In general, the overall faster RTs in L1 German may be interpreted as an
indication that an agreement system with a strong propensity towards syntactic 
agreement allows for faster processing of agreement relations than systems in 
which semantic agreement plays a larger role. This would be consistent with the 
alleged function of grammatical gender as a device helping to keep track of 
reference across discourse (see Contini-Morava/Kilarski 2013 for discussion). It 
is unclear, however, to what extent such a generalisation would extend beyond 
the Germanic varieties included in the present investigation and hold for other 
languages where syntactic and semantic agreement potentially conflict, which 
is, typologically speaking, common in gender systems (cf. Corbett 2013). In the 
varieties of Dutch discussed here, processes of deflection and the covert nature 
of gender assignment have rendered grammatical gender vulnerable, but have 
not obliterated it. It remains an open question what the impact would be if semantic
agreement assumed an even more prominent status than it presently has in
Dutch. A language such as English, for instance, has by and large lost its grammatical
gender in favour of semantically driven pronominal reference. Mills (1986:
91f.) shows that children acquire such a semantic system of pronominalisation
more slowly than a German-style syntactic agreement system. This may indicate that
semantic agreement can indeed be cognitively challenging. It remains to be tested, 
however, whether this slow acquisition corresponds to higher RTs in experiments 
like the ones carried out in this study. 
Subtle differences, rigorous implications:
German and Dutch representation of
tense-aspect features in SLA research
of Spanish


Abstract: This article presents original evidence for an L1-effect in SLA by com- 
paring empirical studies on German and Dutch learners of L2 Spanish (written 
production). In Spanish, grammatical aspect plays a far more prominent role 
(perfectivity is grammaticalized) and thus both learner groups are faced with 
new linguistic features. In both cases, L1-like performance is not achieved. However,
the ways learners deal with this aspectual phenomenon in their written
production are completely different: German learners base their decisions on temporal
markers and trigger words, whereas Dutch learners consider inherent verbal
aspect. We explain this contrast by analysing small differences between the
L1 systems involved: only Dutch learners start from a system with basic aspectual
notions.


Summary: This article presents novel evidence for an L1 effect in second
language acquisition by comparing empirical studies on German and Dutch
learners of Spanish as L2 (written production data). In Spanish, grammatical
aspect plays a more prominent role (perfectivity is grammaticalized), so that
both learner groups are confronted with new features. In both cases, no
native-like performance is observed. However, the ways in which aspectual
phenomena are dealt with differ: German learners base their decisions on signal
words, whereas Dutch learners take inherent verbal aspect into account. We
explain this contrast through an analysis of small differences in the L1 systems:
only Dutch learners start from a system with basic aspectual concepts.


1 Introduction 


Probably the most distinguishing property of adult L2 learners, besides their age
of onset, is the fact that they already have a fully developed language system,
acquired during their first language (L1) acquisition. One obvious question is to
what extent, if at all, the L1 system influences the L2 acquisition process (see e.g.
White 2003 for an overview). In the research literature, the notion of L1-effects is
controversial. Whereas some researchers consider the L1 the main source of
learning difficulties, others claim that it is the complexity of the linguistic property
to be acquired which decisively affects the process.¹


Open Access. © 2020 González/Diaubalick, published by De Gruyter. This work is
licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

In this chapter, we want to contribute to that discussion by synthesizing the 
work of several previous studies which, when contrasted, reveal intriguing differ- 
ences between several groups of learners. For this purpose, we will focus on Dutch 
and German-speaking learners of Spanish as L2. The testing ground consists of the 
tense-aspect-systems of these languages. 

The conclusion drawn from this comparison affects the description of the 
languages themselves. Although an immediate comparison of the verb systems of Dutch
and German only reveals minor differences and leaves substantial uncertainties 
in several respects, these differences have a great effect on how interlanguages 
look. This supports two claims made in this chapter: firstly, there is a clear L1- 
effect which is manifested even if the L1s in question do not seem to rigorously 
differ from each other. Secondly, the differences between the tense systems in 
German and Dutch are indeed present and sharply distinguish the verb systems 
from each other. 

Generally, the mastery of any Romance tense-aspect system is known to be 
highly challenging for Germanic speakers. However, although all Germanic learners
show difficulties when producing the targeted Romance forms, there is a significant
difference in how they try to compensate for them, i.e. which types of learning
strategies are (consciously or implicitly) applied to overcome a possible lack of 
knowledge (see e.g. Cadierno 2000). As results from previous studies show, Dutch 
learners use aspectual distinctions in the L2 but commit errors when selecting the 
aspectual level (inherent instead of grammatical). German learners, in contrast, 
do not consider inherent aspectual properties when selecting a form, but rely on 
elements of the linguistic surface such as adverbs or other lexical elements. 

Based on a review of these results, we aim to derive an important implication 
for linguistic analysis and description, as the comparison between the inter- 
languages of L2 learners will give us insight into the differences between the L1 
characteristics. 


1 We owe many thanks to Henk Verkuyl and Geert Booij for their valuable comments on earlier 
versions, and also to Jill Jeffery for the last revisions. We profited very much from anonymous 
reviews and editor comments. 
The organization of our chapter follows the outlined argumentation: in section 2,
we describe the tense-aspect systems of German, Dutch and Spanish,
focussing in detail on the (partially subtle) differences between the two Germanic
languages and the challenges presented by the Romance language. In section 3,
we summarize existing studies on different learner groups,
according to their L1. New in this context is the cross-linguistic contrast
of these studies, leading to a meta-comparison of the results that are significantly 
different from each other. The chapter closes with a general discussion and 
conclusion. 

The main point consists of the observation that the manifested differences 
between the learners prove that German and Dutch grammar each clearly have
their own intrinsic temporal system. Although these differences seem subtle from a
perspective of grammatical description, they lead to very different outcomes in SLA. 


2 Aspectual systems 


2.1 Inherent Aspect 


When talking about aspectual information, two levels can be distinguished: the 
inherent level and the grammatical one. However, both levels share certain proper- 
ties and may even interact with each other, which has led to different proposals 
regarding how to categorize the phenomena.³ The most relevant argument for the
present chapter is that the expression of aspect as a grammatical contrast is subject
to cross-linguistic variation. Here, inherent aspect is clearly different from
grammatical aspect, as it refers to a universal property of language which allows
verb predicates to be categorized into different classes according to their semantic
content. Crucially, this is possible without a concrete grammatical context
(Comrie 1976)⁴.


2 We choose the term inherent because the notion of lexical aspect is misleading: we want to talk
about the predication rather than the verb itself. We thus approach the topic at phrase level to
experience an inherent boundedness interpretation (compare the tenseless predicates She read a
book vs. She read books vs. No one read a book, which, although featuring the same verb, differ in
their inherent aspect, as will be shown later in this chapter).

3 Within the generativist framework, some researchers highlight the similarity and propose
that inherent and grammatical aspectual levels are coded together within one aspectual phrase
(Tsimpli/Papadopoulou 2009). Others hypothesize that those two levels are completely separated
from each other (e.g. Diaubalick/Guijarro-Fuentes 2016; Rothman 2008).

Although most researchers agree on the theoretical possibility of categorizing 
inherent aspect into various disjoint classes, they disagree as to how to define 
such classes. The initiator of this type of semantic differentiation is Vendler (1957). 
Thanks to his four-way partition, other researchers have been able to refine the 
division and simplify it into three- or two-way partitions. A short description of 
some of these approaches follows. 

Vendler (1957) classified verbs into states, activities, accomplishments and 
achievements. State verbs are characterized by the lack of dynamicity and are 
thus stable over a longer time period (be, love, hate). Activity verbs, in contrast, 
are dynamic and require the addition of energy (read, walk, swim). In contrast to 
accomplishments and achievements, they carry no inherent point of termination 
and can be extended or shortened. Finally, the distinction between accomplish- 
ment (read a book, walk a mile) and achievement verbs (find, arrive, die) lies in 
the fact that the latter are perceived as punctual. 

Building upon these concepts, Comrie (1976: 41-51) shows that three features 
underlie this classification: stativity, telicity and punctuality. Whereas the defini- 
tions of stative (contrasting with dynamic) and punctual verbs (contrasting with 
durative ones) are merely formal reflections of the corresponding intuitions, the 
notion of telicity needs further clarification: this concept refers to the culmination 
of actions and is to be understood as carrying an inherent end-point. Conversely,
the end-point of atelic events is arbitrary. Accomplishment and achievement
verbs are considered telic whereas states and activity verbs are understood as
atelic. As this notion is not very intuitive a priori, several tests have been developed
to determine the (a)telicity of a predicate. For instance, Giorgi/Pianesi (1997) state
that only telic verbs allow the combination with in-adverbials (e.g. ‘to read a book
in one hour’), whereas atelic verbs combine with for-adverbials (e.g. ‘to read for hours’).
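
The relation between Vendler's four classes and the three features of stativity, telicity and punctuality can be laid out as a small lookup table. The following sketch is an illustration only: the class and function names are hypothetical, and the adverbial test is encoded as a simple heuristic rather than a reliable diagnostic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectFeatures:
    """Feature bundle for one Vendler class (names chosen for illustration)."""
    stative: bool   # stative vs. dynamic
    telic: bool     # inherent end-point vs. arbitrary end-point
    punctual: bool  # punctual vs. durative


# Feature values as described in the text: states and activities are atelic,
# accomplishments and achievements are telic, and only achievements are punctual.
VENDLER_CLASSES = {
    "state": AspectFeatures(stative=True, telic=False, punctual=False),
    "activity": AspectFeatures(stative=False, telic=False, punctual=False),
    "accomplishment": AspectFeatures(stative=False, telic=True, punctual=False),
    "achievement": AspectFeatures(stative=False, telic=True, punctual=True),
}


def compatible_adverbial(vendler_class: str) -> str:
    """Return 'in' for telic classes and 'for' for atelic ones (heuristic only)."""
    return "in" if VENDLER_CLASSES[vendler_class].telic else "for"


if __name__ == "__main__":
    for name, features in VENDLER_CLASSES.items():
        print(f"{name:15s} {features}  ->  {compatible_adverbial(name)}-adverbial")
```

The table makes visible that the three binary features single out each of the four classes, and that telicity alone decides between in- and for-adverbials in this simplified picture.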

Such tests are not unproblematic, as there are contexts that allow both types 
but with different readings (see Salaberry 2008; Shirai 2013, among others). This 
problem can be understood as a symptom of an unclear definition: inherent aspect
is to be determined locally; nevertheless, in most languages accomplishment
verbs consist of a verb and an object. When applying the definition to verbs
only, punctuality and telicity turn out to coincide. It remains unclear how many
elements within a sentence must be considered. Another problem lies in the
terminology itself, as the use of the term telic presupposes (via Vendler’s appeal
to Aristotle) that each motion has an inherent goal (telos). This Aristotelian notion
is quite dubious because it is too closely connected with the idea of intentionality.
Expressions like ‘lose a wallet’, ‘resume her seat’, or ‘she came back from the window’
are quite hard to connect with the Aristotelian idea of an inherent goal on its
way to completion (Verkuyl 1993). In fact, Dowty (1979) showed very early on that
this conceptualization of events leads to philosophical paradoxes. If we say, for
instance, that “John was drawing a circle” (ibid.: 133), and construct a context in
which the action was started but not completed, how can we then judge if the
sentence is true? Such deliberations are the reason why a more neutral term is to
be preferred, and more simplified classifications have been proposed.


4 This does not mean that inherent and grammatical aspect are completely independent. Once
there is a concrete grammatical context, both aspectual levels can interact with each other.
Examples of coercion are given in the section treating the Spanish verb system below.

5 It is unclear what is meant by verb classes. It may be that four types of predications are
distinguished, but a focus on aspectual properties of the verb itself could also be meant, when
discussing the case of accomplishments like ‘buy’ or achievements like ‘reach’. In the latter
case, there are some inconsistencies in the use of the notion of verb, because reach cannot occur
without a complement (see Verkuyl 1993 for a discussion).

One such classification is found in the proposal by Moens/Steedman (1988), 
who argue that predicates (verb+internal argument) are partitioned between 
those pertaining to events and those pertaining to states. Both classes are further 
divided in subclasses, but in general terms the ontology can be summarised as a 
distinction between dynamic and non-dynamic predicates. Whereas events are 
defined as “happenings with defined beginnings and ends” (Moens/Steedman 
1988: 17), states do not have these properties. For instance, ‘climbing’ as well as 
‘climbing to the top’ are events, whereas ‘being at the top’ is a state. As this example 
shows, events and states can be interrelated (in this case, the state is a consequence 
of the event). Nonetheless, since the property of telicity is being avoided, a problem 
arises with the properties of beginning and end: there is no clear distinction 
between arbitrary and determined points. 

Due to the limitations of previous frameworks, we will work with the bipartition
proposed by Verkuyl (1993), who defines the concepts of terminativity and
durativity. These terms combine two advantages: they maintain the idea that
events are characterised by termination points, and they make clear that they
apply to whole phrases or verbal predicates, not only to verbs, thus rendering the
telic-atelic distinction unnecessary.

According to Verkuyl (1993, 1999)⁶, and as summarised for L2 research
purposes in González (2003, 2008), the terminativity of a verb phrase is a
compositional function of the properties of the verb and its arguments. The lexical semantic
information given by the verb combines with structural and lexical information 


6 And many others, see Shirai (2013) for a discussion. 
given by the arguments to express whether the situation has, or lacks, a natural
inherent endpoint (terminative versus durative⁷ clauses). This is why this
combination is called predicational aspect, as it depends not only on the semantics
of the verb, but also on the semantics of its arguments (Verkuyl 1999).
The following examples show the clear difference between terminative and 
durative predications. This bi-partition occurs before any temporal information 
is given to the sentence as in the predications below, where inflection is not 
expressed yet. 


(1) read a book (terminative) 
(2) read newspapers (durative) 
(3) love that book (durative) 
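

The compositional idea behind examples (1) to (3) can be made concrete with a small sketch. It assumes, as a simplification of Verkuyl's framework, two binary features that this chapter does not spell out: whether the verb is dynamic, and whether the internal argument denotes a bounded quantity. The class, field and function names are hypothetical and serve illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predication:
    """A tenseless predication: a verb plus its internal argument."""
    verb: str
    argument: str
    verb_is_dynamic: bool         # assumed feature: the verb expresses change or progress
    argument_is_quantified: bool  # assumed feature: the argument denotes a bounded quantity


def predicational_aspect(p: Predication) -> str:
    """Return 'terminative' only if verb and argument both contribute a positive value."""
    if p.verb_is_dynamic and p.argument_is_quantified:
        return "terminative"
    return "durative"


EXAMPLES = [
    Predication("read", "a book", True, True),       # example (1)
    Predication("read", "newspapers", True, False),  # example (2): bare plural, no bounded quantity
    Predication("love", "that book", False, True),   # example (3): stative verb
]

for p in EXAMPLES:
    print(f"{p.verb} {p.argument}: {predicational_aspect(p)}")
```

On this simplified view, a single negative value, whether contributed by the verb or by the argument, is enough to yield a durative predication, which is why (2) and (3) pattern together despite their different sources of durativity.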


For the purposes of this chapter, two claims shared by the competing classifications 
are relevant: firstly, verb predicates have diverse aspectual properties and thus 
behave differently regarding boundedness and termination. Secondly, it is not the 
grammatical structure that determines these inherent aspectual properties, but 
the lexical features of the elements contained in the verb phrase and in a full 
(tenseless) predication. The grammar (i.e., tense morphology) is applied to the 
tenseless verb phrase in a further step and contributes additional information to 
the aspectual interpretation. This leads us to the concept of grammatical aspect, 
which is presented in the next section. 


2.2 Grammatical Aspect 
2.2.1 Generalities 


Broadly speaking, grammatical aspect concerns the temporal boundedness of a 
given context, and thus can only be determined when a clear speech context is 
known. Unlike tense, grammatical aspect is not a deictic category, and
can be determined without referring to the moment of speech (Comrie 1976). 
Traditionally, one finds a division between perfective and imperfective aspect 
within grammatical aspect, which is coded in Romance languages in their past 
tense forms (where tense and aspect are expressed simultaneously). According to 


7 It is important to note that the term durative used here carries a different meaning than the 
same term applied by Comrie (1976) as stated above. In the context here, durativity is not defined 
as a contrast to punctuality, but indicates an event without inherently defined termination point. 
Domínguez/Arche/Myles (2017), four basic notions can be distinguished:
progressivity, perfectivity, continuity and habituality. Progressivity, continuity and
habituality have been understood as readings of the imperfective realm. Recent
theoretical developments, however, point to a new understanding of the progressive
as standing outside the grammatical aspect spectrum (González/Verkuyl 2017):
González/Verkuyl propose that the progressive use should be formally eliminated
from the traditional readings of the imperfective (ibid.: 133). What is important
to reiterate for the argumentation of this chapter is that not all languages mark 
grammatical aspect by the same means. 


2.2.2 Spanish TA (Tense/Aspect) System 


As a Romance language, Spanish requires the marking of grammatical aspect in 
its past tenses (Zagona 2007; Gonzalez 2003, 2013). In the following description 
we will focus on three main past tense forms: Pretérito Perfecto Compuesto 
(Present Perfect), Pretérito Perfecto Simple (Preterit) and Pretérito Imperfecto 
(Imperfect). The opposition between Preterit and Imperfect represents the
aforementioned perfectivity/imperfectivity contrast. However, the Present Perfect also
plays a role in aspectual distinctions.⁸


Present Perfect: 

(4) He leído un libro.
I-have read.PARTICIPLE one book
‘I have read a book.’


Preterit:

(5) Leí un libro.
I-read.PRET one book
‘I read a book.’


Imperfect:

(6) Leía un libro.
I-read.IMP one book
‘I was reading/read/used to read a book.’


8 This section is an adaptation of a similar section in González/Verkuyl (2017) and González/Quintana
Hernández (2018).
The Present Perfect is mostly used in hodiernal contexts, where it expresses
anteriority with respect to the present, and focuses on the result of the event (see (4)).
This form is more common in Peninsular Spanish than in American Spanish (see
González/Verkuyl (2017) for a description of this variation). However, other
studies (Schwenter/Torres Cacoullos 2008) show that the Present Perfect has been
extended to perfective uses in prehodiernal contexts in European Spanish, as in
He comido ayer ‘I have eaten yesterday’. Because of this variation, it is important
to consider the Present Perfect when defining past tenses in Spanish.

The Preterit in example (5) presents the event as anterior to some
anchoring point provided by the discourse and completely dissociated from it. It
presents an event as a discrete whole at some specific moment in the past
(perfective aspect) and is not used in perfect contexts such as *Comí hoy ‘I ate today’ in
European Spanish.⁹

Finally, Spanish has a morphologically marked imperfective past tense
(example (6)). The Imperfect is often taken as presenting an event in process, i.e.
as not delimited, which implies that the difference is aspectual, not temporal
(García Fernández 1999; Leonetti 2004). It leaves the event unspecified as to its
completion. There are several readings related to imperfective aspect: the
progressive, the habitual and the continuous reading (González 2003; Domínguez/
Arche/Myles 2017). These related meanings all allow the Imperfect morphology.¹⁰


(7) Laura leía el periódico en aquel instante.
Laura read.IMP the newspaper in that instant
‘Laura was reading the newspaper at that moment.’ (progressive)


(8) Laura leía el periódico todos los domingos.
Laura read.IMP the newspaper all the Sundays
‘Laura read the newspaper every Sunday.’ (habitual)


(9) Laura leía el periódico aquel domingo (y todavía lo lee).
Laura read.IMP the newspaper that Sunday (and still it read.PRES)
‘Laura read the newspaper that Sunday (and is still reading it).’ (continuous)


9 However, this use is fully accepted in Latin American Spanish (Rojo/Veiga 1999). 

10 González/Verkuyl (2017) defend the idea that the progressive is not a reading of the
imperfective. Yet, for the purposes of this chapter we adhere to the more traditional understanding
of imperfective readings.
The Imperfect encompasses all three readings and thus can be said to
underspecify which reading is the most appropriate. The concrete interpretation in a
given situation thus depends not only on the verb form, but also on the
sentence context.


2.2.3 Germanic systems 


All Germanic languages share inherent aspectual values.¹¹ In contrast to Romance
languages, they contain fewer or no instances of grammatical aspect marking. 
Before describing the aspectual differences between Dutch and German, there 
are some interesting generalizations to be made: First, throughout the scientific 
literature on aspect, the most studied Germanic language is English (Comajoan 
2014), and second, although all Germanic languages differ significantly from 
Romance languages, they do not present a homogeneous group. 

At first sight, Dutch and German tense-aspect systems seem rather similar
(Borik/González/Verkuyl 2004; ten Cate 2004). In Table 1, four temporal-aspectual
operators are used: PRES marks the present tense, PAST marks a past event,
POST stands for an event posterior to a reference time, and PERF
stands for the completion of the event. As we can see, German and Dutch have
formal equivalents for all relevant tense forms.


Table 1: Tense operators for Dutch and German (adapted from Borik/González/Verkuyl 2004)


Present                                        Past

PRES                                           PAST
D: Ik schrijf een brief.                       D: Ik schreef een brief.
G: Ich schreibe einen Brief.                   G: Ich schrieb einen Brief.
‘I write a letter’                             ‘I wrote a letter’

PRES(POST)                                     PAST(POST)
D: Ik zal een brief schrijven.                 D: Ik zou een brief schrijven.
G: Ich werde einen Brief schreiben.            G: Ich würde einen Brief schreiben.¹²
‘I will write a letter’                        ‘I would write a letter’

PRES(PERF)                                     PAST(PERF)
D: Ik heb een brief geschreven.                D: Ik had een brief geschreven.
G: Ich habe einen Brief geschrieben.           G: Ich hatte einen Brief geschrieben.
‘I have written a letter’                      ‘I had written a letter’

PRES(POST)(PERF)                               PAST(POST)(PERF)
D: Ik zal een brief geschreven hebben.         D: Ik zou een brief geschreven hebben.
G: Ich werde einen Brief geschrieben haben.    G: Ich würde einen Brief geschrieben haben.
‘I will have written a letter’                 ‘I would have written a letter’


11 However, these features can be organised in different language-specific ways within the
lexicon.

12 The German versions are presented in italics, as this form usually conveys a conditional
reading. Only in some specific contexts (e.g. indirect speech) can it denote a future in the past.
This difference does not have any relevance for the argumentation here.


Although on the formal side, the verb systems appear almost entirely alike, some 
of these similarities are only superficial and do not correspond to the use of the 
forms. 
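
The operator notation in Table 1 is compositional: each form combines a tense operator with optional POST and PERF operators. The following sketch, with hypothetical function and variable names, simply enumerates the eight combinations in the order of the table; it says nothing about how the forms are actually used in either language.

```python
from itertools import product

TENSES = ("PRES", "PAST")


def operator_label(tense: str, post: bool, perf: bool) -> str:
    """Build a label such as 'PAST(POST)(PERF)' from a tense and two optional operators."""
    label = tense
    if post:
        label += "(POST)"
    if perf:
        label += "(PERF)"
    return label


# product() varies the last factor fastest, so tense alternates within each row,
# POST changes next, and PERF last, which mirrors the row order of Table 1.
LABELS = [
    operator_label(tense, post, perf)
    for perf, post, tense in product((False, True), (False, True), TENSES)
]

for present, past in zip(LABELS[0::2], LABELS[1::2]):
    print(f"{present:<20}{past}")
```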


2.2.3.1 Dutch¹³

As a Germanic language, Dutch is strongly “tense-oriented” (Broekhuis/Corver/
Vos 2015). However, it does show a few aspectual phenomena. The distinction between
the Simple Past form and the Present Perfect form could be understood as
aspectual (Borik/González/Verkuyl 2004), as the simple past is imperfective in nature
and the present perfect acts as both perfect and perfective, depending on the
context. In (10) there is a simple past with a habitual (hence imperfective)
reading, and in (11) the perfect form is used with a perfective meaning (Van Hout
2005):


(10) Ik las altijd veel boeken. (Simple Past) 
I read.PAST always many books
‘I always read many books.’ 


(11) Ik heb gisteren honderd emails gelezen. (Present Perfect) 
I have yesterday hundred emails read.PARTICIPLE 
‘Yesterday I read a hundred emails.’ 


(12) Ik heb vandaag drie kilometer gelopen. (Present Perfect)
I have today three kilometres run.PARTICIPLE
‘Today I have run three kilometres.’ 


13 This section is an adaptation of a similar section in González/Verkuyl (2017).
As shown in (11) and (12), the Present Perfect can have both a perfect and a
perfective reading, depending on the context. In (11) the Perfect is used in a
prehodiernal context (yesterday), where traditionally one would expect only a
perfective form.¹⁴ In (12) we find the more traditional and default use of the perfect,
in a past situation where the temporal domain is still valid at the moment of
speech (today).

Moreover, Dutch has a progressive construction, as shown in (13). It is “used 
to refer to some eventuality during speech time” (Broekhuis/Corver/Vos 2015: 
151). This description is based on the progressive construction with a present 
tense auxiliary but can also be applied to its past tense counterpart. 


(13) Ik was koffie aan het drinken.
I was coffee at the drink.INF
‘I was drinking coffee.’


(14) Ik zat koffie te drinken.
I sat coffee to drink.INF
‘I was drinking coffee.’


Sentence (14), with the verb ‘sit’ in auxiliary position, is actually more readily accepted
with a progressive reading, or even with a habitual sense.


2.2.3.2 German 

In German, the most important past tenses in terms of usage frequency are the 
Present Perfect and the Simple Past. Although, given their morphology, they 
seem similar to the corresponding tenses in Dutch (and even in Spanish), there 
are some clear differences in their use. In the research literature there is a debate 
as to whether these tense forms carry different aspectual features. In fact, it is 
disputed in the literature whether there is any grammatical aspect at all in German 
(see e.g. Schwenk 2012). 

Recent investigations indicate that the verb forms do not express aspectual 
contrasts but carry rather stylistic features (Heinold 2015). Generally, the Perfect 
is regarded as more colloquial and is preferred in the spoken language, whereas
the Simple Past, sometimes also referred to as Imperfect or Preterit (see Vater
2010 for terminological questions), occurs in more formal contexts and is


14 It is noteworthy that, as in other European languages such as French, Italian (Romance) or
German (Germanic), adverbial phrases such as gisteren ‘yesterday’, referring to temporal
intervals preceding speech time, are used in Present Perfect constructions.
reserved for written texts. What is essential for the purpose of this chapter is
that, at least in colloquial language, an interchange of the forms does not lead to 
a change of meaning, but simply reflects another style. 

To give an example, the sentence ‘She was watching TV when she received the
call’ can be translated in four different ways:


(15) Sie sah fern, als sie den Anruf bekam.
She watch TV-PRET when she the call get-PRET

(16) Sie hat ferngesehen, als sie den Anruf bekommen hat.
She watch TV-PERF when she the call get-PERF

(17) Sie hat ferngesehen, als sie den Anruf bekam.
She watch TV-PERF when she the call get-PRET

(18) Sie sah fern, als sie den Anruf bekommen hat.
She watch TV-PRET when she the call get-PERF


Whereas (15) sounds rather formal, (16) is a common sentence in colloquial
language. The other two sentences can be classified as somewhere in between
formal and informal use. Although there are some dialectal differences regarding
which alternative is the most preferred one, what is crucial for our analysis is 
that there are no semantic distinctions whatsoever (Heinold 2015). Contrasts 
such as perfectivity must be expressed through lexical means if the context
requires it. These means can consist of the use of another verb with a
different lexical aspect, or of adding temporal adverbs, particles or non-stand- 
ardized periphrases. Such a periphrasis, for instance, is found in the progres- 
sive form Ich bin am Lesen ‘I am reading’. However, this form is not comparable 
to the Dutch Progressive in (13) (Andersson 1989; Krause 1997) as it is region- 
ally restricted and highly stigmatized from a normative point of view (Thiel 
2008). 

In sum, it is reasonable to conclude that German verb forms have no
morphological means to express aspect. In this respect, German grammar differs
significantly from that of other Germanic languages, such as English
and Dutch, where a basic aspectual contrast is still available. The competing
past forms are only marked for tense and, although they may differ in style, are
generally interchangeable. Interestingly, this observation extends even to
auxiliary verbs, such that the pluperfect Ich hatte angerufen can be expressed as Ich
habe angerufen gehabt ‘I have had telephoned’, a form which, despite its
exceptional status in the German tense system, is frequently used in the spoken
language (see Duden 2009).
Table 2: Summary of interlinguistic differences regarding grammatical aspect


Spanish   Consistent marking of grammatical aspect within the past tense
Dutch     Some aspectual contrasts (habituality, progressivity)
German    Grammatical aspect is not marked morphologically


2.2.4 Comparison 


This chapter pursues the idea that theoretically motivated discussions about the 
properties of the tense-aspect system have an immediate relevance when it 
comes to L2 learning. Although the differences between German and Dutch
might seem rather subtle, we maintain that they lead to significantly different
patterns when comparing German-speaking and Dutch-speaking learners of a
language in which grammatical aspect plays a major role, such as Spanish. Our
innovative angle is thus to show that by looking at the interlanguage of L2
learners with different L1s, we can gain insight into how the different L1s organize
grammatical information.

Undoubtedly, the Romance languages have a richer aspect system than the 
Germanic languages. If, for instance, a Spanish sentence needs a translation in 
which no information is being lost, lexical elements (adverbs, particles, etc.) 
must be used to make the aspectual contrasts explicit. In that regard, Germanic 
verb forms are underspecified (Sanchez Prieto 2011). Regarding inherent aspect, 
on the other hand, there are no major differences between the three languages 
presented here. 

Nonetheless, the presentation of the different verb systems in the sections 
above has shown that the Germanic languages do not represent a homogeneous 
group either. Whereas Dutch contains a basic aspectual notion in its tense
system, we conclude that German has no such notion at all (see Table 2). Although in
direct translations from one language to the other this difference is in most cases
negligible, we argue that it nevertheless leads to clearly distinct
representations which, in an L2 learning context, turn out to be significant. Whereas
a German-speaking learner of Spanish as L2 is faced with a completely new
category (i.e. perfectivity), a learner with Dutch as L1 already has a vague idea of it, since
in Dutch there are perfect forms with perfective meanings and a progressive
construction; that is, the concept that verbal morphology can carry
grammatical aspect is already familiar.

An appropriate concept to formalize the subtle differences can be derived 
from micro- and macro-parameters in the generativist framework (see Kayne 
2005). The general idea that language acquisition can be accounted for via the
concept of parametric differences (Chomsky 1995) has changed over the years, so 
that nowadays the focus lies rather on the acquisition of features (see e.g. Hwang/ 
Lardiere 2013). For instance, in the case of the Spanish past tense forms, 
Dominguez/Arche/Myles (2017) define four features as relevant: [+perfective], 
[+continuous], [+habitual] and [+progressive]. 

An alternative conceptualization of microparameters is proposed by Roberts 
(2014), who suggests that parameters are actually organized hierarchically.
According to this view, a macro-parameter is nothing more than a set of several
micro-parameters sharing similar properties. In the context of grammatical aspect, 
this can be understood in terms of the following: on a micro-level, we can ask if a 
given aspectual feature is reflected by a grammatical marker in the given language. 
The set of all aspectual markers, then, corresponds to the macro-parameter. 
Applying this notion to the languages treated in this chapter, German differs from 
both Spanish and Dutch on a macro-parametrical level, since there is no marking 
of grammatical aspect at all. Comparing Spanish and Dutch, on the other hand, 
although both languages have grammatical aspect markers, they differ in terms 
of micro-parameters. Spanish requires the consistent marking of perfectivity,
whereas in Dutch there is only grammaticalized expression of progressivity and a 
basic aspectual contrast in the different past tenses (see also Salaberry/Ayoun 
2005 for similar arguments on English). 
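
The hierarchical reading of parameters described above can be illustrated with a brief sketch. It is an assumption-laden simplification: the feature labels are ours, the sets list only the markings mentioned in this chapter (they are not meant to be exhaustive), and the function names are hypothetical.

```python
# Micro-level: which aspectual features does the grammar of each language mark?
# Only markings discussed in this chapter are listed; labels are illustrative.
MARKED_FEATURES = {
    "Spanish": {"perfectivity"},
    "Dutch": {"progressivity", "basic_past_contrast"},
    "German": set(),
}


def macro_parameter(language: str) -> bool:
    """Macro-level: does the language mark grammatical aspect at all?"""
    return bool(MARKED_FEATURES[language])


def difference_level(lang_a: str, lang_b: str) -> str:
    """Classify the difference between two languages as 'macro', 'micro' or 'none'."""
    if macro_parameter(lang_a) != macro_parameter(lang_b):
        return "macro"
    if MARKED_FEATURES[lang_a] != MARKED_FEATURES[lang_b]:
        return "micro"
    return "none"


for pair in (("German", "Spanish"), ("German", "Dutch"), ("Spanish", "Dutch")):
    print(pair, difference_level(*pair))
```

Under these assumptions, the macro-parameter is derived from the set of micro-level markings rather than stored separately, which is one way to render the claim that a macro-parameter is nothing more than a set of micro-parameters.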

In the next section, we will review findings of empirical studies which support 
the proposed approach. 


3 Consequences for L2 Learners 


3.1 Background 


In the context of acquiring the aspectual system of Spanish as an L2, the main
task for speakers of Germanic languages consists of considering the marking of
(im)perfectivity¹⁵ in Spanish and of understanding the consequences of the
contrast to German forms. As argued above, Dutch- and German-speaking
learners have different starting points which may affect their sensitivity to
grammatical aspect as a general notion.


15 And in a way, also the Perfect, although it is not part of the main argument presented here.
According to research in several branches of linguistics, great differences
between the L1 and the L2 may hinder the acquisition of the latter, whereas
similarities can have an accelerating effect. Within the generativist framework,
for instance, one important approach assumes the Feature Reassembly Hypothesis
(Hwang/Lardiere 2013). This hypothesis posits that SLA can be broken down into a
continuous task of reorganizing features, starting from the L1 configuration. The
more differences there are in how formal features are mapped to grammatical
forms, the more difficulties arise for the learner. Consequently, the learning process
is significantly slower in comparison to learners who start from an L1 with fewer
differences to the target system. During the reconfiguration process, it is argued
that a competing rule-based system can take over, one which is constructed consciously
and deduced directly from pedagogical input (Rothman 2008).

L1-effects are also discussed within the usage-based approach to L2 acquisition 
where tense-aspect phenomena are a highly investigated research subject (see 
Bardovi-Harlig 2000 for an overview). In such studies, the focus is more on how 
(i.e. in which steps) a grammatical competence is achieved than on why this 
behaviour occurs.¹⁶

Nonetheless, the L1 effect is not supported by all researchers. For instance, 
Ayoun/Salaberry (2008) claim that it is irrelevant for non-complex phenomena, 
and Gabriele/McClure (2011) even state that only the complexity of a given phe- 
nomenon itself, not the difference with the corresponding L1 property, deter- 
mines the degree of difficulty in acquiring it (see Dominguez/Arche/Myles 2017 
for a review). The acquisition of the tense-aspect system in Romance languages 
represents a promising testing ground for a deeper investigation of these issues, 
as it is characterized both by a high complexity and by a large cross-linguistic 
variation, as seen above. 

Many researchers (e.g. Housen 2000; Izquierdo/Collins 2008, just to name a 
few) found that in precisely such cases even the most proficient learners do not 
follow native-like patterns if their L1 differs significantly from the target language. 
More concretely, instead of choosing a verb form based on grammatical aspect, 
they rely on lexical features (more details in section 3.2). A general observation 
is that the greater the L1-L2 differences are, the more the learners rely on such 
learning strategies (Izquierdo/Collins 2008: 352). 

The dissociation of grammatical and inherent aspect turns out to be the 
main task for learners of Romance languages as L2 and has often been argued 


16 According to Rothman (2008), this is a general disadvantage in comparison to more formal 
approaches. In this chapter, we will combine several approaches without similarly rigorous 
judgements. 
to be the main source of difficulties (Andersen 1986, 1991; Salaberry 2008).
Importantly, most of the evidence supporting those arguments is based on
English-speaking learners. Other Germanic languages, such as German and
Dutch, remain rather understudied. This yields opportunities for further research,
since the English tense-aspect system is not identical to any of the systems
described in 2.1.2. McManus (2015) found that English- and German-speaking
learners of French as L2 behave very differently, as shown in an experimental
study with 75 participants with a comparable proficiency level in French:
whereas in habitual contexts both groups showed notable difficulties with the
past tenses, the English-speaking group outperformed the German-speaking group in progressive
contexts. This result is directly relatable to the L1 of the learners since, contrary to
German, English has a grammaticalized Progressive. McManus (2015) concludes
that such differences between the L1 and the L2 can affect the way in which
grammatical contrasts are acquired and processed.

In the following sections we take a similar approach. By comparing Dutch
and German learners from previous empirical studies, we posit a clear difference between
the two groups. We argue that the only way to explain the
differences in the L2 data is by considering that the Dutch and the German tense-aspect
systems are clearly different; in other words, we are faced with a clear L1 effect.


3.2 Previous studies 
3.2.1 Research overview 


There are only a few studies tackling the specific combination of German or Dutch 
as L1 and Spanish as L2. Known exceptions often do not focus on the specific 
Germanic languages and their properties but compare speakers of different lan- 
guages with each other to argue in favour of a general L1 effect (Diaz/Bel/Bekiou 
2008). Generally, the most studied L1 in research on the acquisition of the Span- 
ish tense-aspect system is undoubtedly English (see Comajoan 2014 for a review). 
Furthermore, when German speakers are included in studies, the language
most frequently investigated is again English, this time as the L2 (e.g. von Stutterheim/
Carrol/Klein 2009).

Thorough research on English-speaking learners of grammatical aspect in 
an L2 has brought about many specific hypotheses. For instance, according to 
the Lexical Aspect Hypothesis (Andersen 1986, 1991), which uses Vendler’s four- 
way partition, learners establish a relationship between lexical aspect and the 
grammatical form: state verbs are initially only marked with the Imperfect, 
whereas achievements appear with the Preterit. During the learning process, 
other combinations are sequentially acquired, but the lexical aspect always
determines the order. Another proposal is found in the Default Past Tense 
Hypothesis (Salaberry 2008), which states that beginners use only one past 
tense (most times, the Preterit) for all past events regardless of their aspectual 
features. Although both hypotheses were based on data from Anglophone learn- 
ers, their original formulation does not necessarily suggest a dependence on 
properties of English. It is hence unclear whether they are intended as universal
claims; more research is necessary.

As mentioned above, McManus (2015) showed that English-speaking and 
German-speaking learners behave significantly differently. We therefore maintain
that a comparison between speakers of different Germanic L1s is necessary.
Contrary to McManus (2015), however, we will not focus on French¹⁷ as the target
language, for the following reason: although the French tense-aspect system
presents almost the same perfectivity contrast as the Spanish one, research
findings are not directly transferable.

The Imperfect is similar in both languages (Amenós-Pons 2015), but in contrast
to Spanish, the default past tense form for the perfective aspect in French,
especially in spoken language, is the analytical Present Perfect, whereas the synthetic
Preterit (Passé Simple) has become outmoded and is clearly reduced in its uses (Labeau 2005).
In current French, the perfectivity distinction is thus manifested in a contrast
between a simple form and a compound form, namely between the Imperfect and
the Present Perfect. In an L2 context, this leads to a higher vulnerability to transfer,
because Dutch- or German-speaking learners could easily establish a
connection between the morphologically similar tense forms in their mother tongue. In
Spanish, this connection is significantly less evident, since here the opposition is
between two simple forms: Imperfect and Preterit.¹⁸ A transfer based on
morphological similarities thus cannot occur.

Precisely for that reason, we are convinced that the focus on Spanish as L2 is 
important to see how transfer in the domain of tense and aspect with a Germanic 
language as L1 works. Since orientation by surface form is not possible,
the learner is forced to concentrate on meaning. In the next sections we will 
show that this is indeed what happens, although neither the reported Ger- 
man-speaking learners nor the Dutch ones appear to achieve a native-like com- 
petence. In both cases, a compensating learning strategy (i.e., the explicit appli- 
cation of rule-based decisions, see Hawkins/Chan 1997) is developed to handle 


17 In fact, the combination of German and French has already been researched in more detail 
(see Rieckborn 2007). 
18 And possibly Perfect in some dialects, as an attentive reviewer pointed out. 


316 —— Paz González/Tim Diaubalick


the contrast. These strategies partially fulfil their compensating function and 
produce some target-like patterns. In other contexts, they lead to non-expected 
behaviour. Since the strategies are noticeably different, we will conclude that 
this observation is a direct consequence of the differences between the two sys- 
tems involved. 

Given the lack of concrete studies that feature German and Dutch speakers
together, we will report on previous L2 findings from studies in which the two learner groups
were analysed separately. Although some of the following has been reported by
us elsewhere, what is new here are the conclusions drawn from the contrast
between these studies.


3.2.2 Findings on German as L1 


In a study embedded in the generativist framework, Diaubalick/Guijarro-Fuentes
(2016) tested the interpretation and production of the past tense forms by 71
German learners with different proficiency levels of Spanish as L2 (intermediate
to advanced). Using a Grammaticality Judgement Task in combination with a
Sentence Completion Task, they showed that there was no direct transfer on
a morphological level, i.e., the Spanish Present Perfect was not overgeneralized.
That is, learners had successfully grasped the fact that the most frequent
forms are the synthetic ones: Imperfect and Preterit.

However, a comparison with a control group showed that the learners 
behaved significantly differently from L1 Spanish speakers. Although for standard 
contexts (i.e. prototypical contexts), a developmental effect was visible, in more 
complex uses of the past tenses (where inherent and grammatical aspect differ), 
the data showed persisting difficulties in the learners. In such cases, an explicit 
learning mechanism became visible which had a clear compensating function: in 
cases of doubt, learners relied on temporal adverbs when choosing between one 
or the other verb form. Whereas this effect was directly visible in the production 
task, it also led to a clear effect on how items of the Grammaticality Judgement 
Task were evaluated. 

Temporal markers are often taught in courses of Spanish as a foreign
language and appear as rules of thumb in textbooks (Salaberry 2008). The
adverbial la semana pasada ‘last week’, for instance, locates a past event in a
completed context and hence usually co-occurs with the Preterit. Diaubalick/
Guijarro-Fuentes (2016) showed that in precisely those contexts where such a 
known marker is to be combined with the non-expected form (e.g. la semana 
pasada and an Imperfect), significant differences between the learners and the 
control group arise.

To confirm these patterns, a subsequent study was conducted following a
usage-based approach (Diaubalick/Guijarro-Fuentes 2017), where German
learners were contrasted with speakers of other L1s (French, Italian,
Portuguese). A total of 131 non-native speakers participated in the study. The results
show that none of the common hypotheses of the usage-based approach pre- 
sented above (Lexical Aspect Hypothesis, Default Past Tense Hypothesis) 
could be entirely confirmed. Instead, individual variables such as learning
background must be considered, among which the most prominent was the
learner’s L1.


3.2.3 Findings on Dutch as L1 


Gonzalez (2003, 2013) and Gonzalez/Quintana Hernandez (2018) collected data 
on the acquisition of past tense forms by Dutch learners. In Gonzalez (2003), 
17 Dutch classroom L2 learners of Spanish following a beginner’s course took 
part in an experiment, where data were collected through standardised tests 
(fill-in-the-blank and multiple choice). In Gonzalez/Quintana Hernandez
(2018), 31 Dutch classroom L2 learners of Spanish following an A2-level course
took part in another experiment, where data were collected through a written 
production task. 

There are striking differences in the results of both experiments. These can 
be summarized as follows: in both studies, it is shown that the Spanish Preterit 
was the preferred form. In those cases where the Imperfective appeared, it was 
more often with durative predications, whereas the Preterit occurred more often 
with terminative predications. There was a clear superposition of inherent aspect 
in Dutch onto the choice of past tense forms in their L2. In other words, when a 
predication was terminative, as in ‘read the letter’, the past L2 production would 
be with the Preterit (leyó la carta); when a predication was durative, as in ‘be
hungry’, the past L2 production would be with the Imperfect (tenía hambre).
These types of constructions were found in both standardised tests and in free 
production data. In both cases the results were significant. 

The main conclusion concerning these results is that the use of a past tense 
form is influenced by the inherent aspect of the predication the learners want to 
produce in their interlanguage. In the second study (2018), the Present Perfect 
appears constantly in the informants’ interlanguage. The studies on Dutch learn- 
ers also lead to two important conclusions: first, free production tasks cannot be 
treated in the same way as standardised tests, where a clear choice is given to the 
informants. So, as van den Bergh/Rijklaarsdam (1999: 13) state: ‘the nature of 
writing processes is recursive and dynamic: different sub processes can and do
occur at any moment during the process’. Secondly, the overuse of the Perfect can
be explained as L1 transfer (see section 3.3.1).


3.3 Summary and comparison 


What this brief survey of previous studies has shown is that both Dutch-speaking and
German-speaking learners of Spanish as L2 show evident target-deviant patterns
in the use and interpretation of past tenses. However, the deviations occur in very 
different ways, as we will show in the next section. 

Applying the idea by McManus (2015) that the nature of the aspectual contrast
needs to be the focus of the investigation, it can be observed that the
manner in which target deviations occur is strikingly different. That is, although
both learner groups fail to acquire the target system completely, they differ
significantly in how this type of ‘error’¹⁹ manifests itself. In Spanish, the selection of
an appropriate past tense form requires the consideration of the global context 
that defines the aspectual properties of the sentence. As the studies have shown, 
learners do not carry out this process entirely. In both cases, the studies seem to 
have detected compensating mechanisms based on explicitly learned rules that 
the learners develop to overcome the difficulties in processing the aspectual fea- 
tures in a target-like fashion. The learning strategies are based on radically sim- 
plified patterns, and it is precisely here where the differences are located: 
whereas German learners base their strategy on lexical elements, such as tempo- 
ral adverbials, Dutch learners rely on inherent aspect (durativity and terminativ- 
ity clues). 

These differences lead to diametrically opposed behaviour in some contexts.
Comparing the results of Diaubalick/Guijarro-Fuentes (2016, 2017) with Gonzalez 
(2003, 2013), German-speaking learners display a target-deviant behaviour when 
adverbials are misleading, whereas Dutch learners, in those sentences where 
inherent and grammatical aspect diverge, do not behave target-like even when 
the adverbials are facilitating. 

The finding that temporal markers affect learners’ behaviour is not new and 
has in fact been shown in numerous studies. Rothman (2008) claims that the 
pedagogical rules taught in class are applied in a stronger way than the learner’s 
own intuitions. Nonetheless, the findings reported here are different in some 
significant points from previous studies. For instance, the Dutch learners have 


19 We are aware of the negative connotations carried by this expression (see e.g. Cook 1997). It is 
our aim to simply attest a difference between the target system and the learners’ interlanguage.

shown that a helpful adverbial is not always considered, which is counterevidence
for Rothman’s (2008) claim that the reliance on trigger words will overwrite the 
reliance on temporal markers. Furthermore, according to Baker/Quesada (2011), 
who base their arguments on findings concerning Anglophone learners, the effect 
exercised by temporal adverbials is generally weaker than the reliance on inherent 
aspect, which contrasts with the findings among the German learners presented 
above. Additionally, the effect was visible both in the interpretation and the
production tasks. In conclusion, it is safe to say that German learners base their
decision on temporal markers, a strategy which does not nullify an acquired competence,
but rather compensates for the lack of it. For the Dutch group, these patterns were
not observed. 

The role of the temporal markers is therefore crucial for the following
argumentation, and can best be illustrated by sentences in which inherent aspect and
temporal markers do not trigger the same tense; that is, where the elements can
be regarded as conflicting evidence.

Consider the following example (taken from the multiple-choice task of 
Gonzalez 2003): 


(19) Ayer      {pasaba/pasé}  un  rato  en el  café donde
     Yesterday spend.IMP/PRET one while in the cafe where
     Nuria {tomaba/tomó} el  desayuno  todos los domingos.
     Nuria take.IMP/PRET the breakfast all   the Sundays.

     ‘Yesterday I spent some time at the coffee house where Nuria had her
     breakfast every Sunday.’


In this case, ayer ‘yesterday’ is a known marker of the Preterit, whereas todos 
los domingos ‘every Sunday’ occurs mostly with the Imperfect. Considering this, 
the tense forms to be chosen should be first pasé (Preterit) and then tomaba 
(Imperfect). Given the context of the two events, this would be the expected 
answer. Conversely, the inherent lexical aspect of the two given predications in
(19) points in the opposite direction. Whereas pasar un rato ‘to spend some time’ is
a durative predicate, tomar el desayuno ‘have breakfast’ is a terminative one. If the
learning strategy is based on the correlations durativity-Imperfect and
terminativity-Preterit, the learner would choose pasaba and tomó, that is, the opposite of
what we would first expect.
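
The two competing strategies can be made explicit in a small sketch. The following Python fragment is purely illustrative and is not taken from the cited studies: it encodes the associations described above (ayer with the Preterit, todos los domingos with the Imperfect; durative predicates with the Imperfect, terminative ones with the Preterit) and applies each strategy to the two clauses of (19). All names (MARKER_CUE, ASPECT_CUE, marker_strategy, aspect_strategy, evaluate) are hypothetical labels introduced for this illustration.

```python
# Illustrative sketch only: the dictionaries encode the associations
# described in the text; all names are hypothetical.
MARKER_CUE = {"ayer": "PRET", "todos los domingos": "IMP"}
ASPECT_CUE = {"durative": "IMP", "terminative": "PRET"}

# The two clauses of example (19) with adverbial, inherent aspect and target form.
CLAUSES = [
    {"verb": "pasar un rato", "adverbial": "ayer",
     "aspect": "durative", "target": "PRET"},
    {"verb": "tomar el desayuno", "adverbial": "todos los domingos",
     "aspect": "terminative", "target": "IMP"},
]


def marker_strategy(clause):
    """Choose the tense form associated with the temporal adverbial."""
    return MARKER_CUE[clause["adverbial"]]


def aspect_strategy(clause):
    """Choose the tense form associated with the inherent aspect."""
    return ASPECT_CUE[clause["aspect"]]


def evaluate(strategy):
    """Return (verb, chosen form, target-like?) for each clause of (19)."""
    return [(c["verb"], strategy(c), strategy(c) == c["target"]) for c in CLAUSES]


if __name__ == "__main__":
    for name, strategy in (("marker-based", marker_strategy),
                           ("aspect-based", aspect_strategy)):
        for verb, form, target_like in evaluate(strategy):
            print(f"{name:13} {verb:18} {form:5} target-like={target_like}")
```

Under these assumptions, the marker-based strategy selects the target forms in both clauses (pasé, tomaba), whereas the aspect-based strategy selects the non-target forms (pasaba, tomó).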

In (19), thus, the use of the temporal markers gives a helpful cue, whereas the 
reliance on inherent aspect leads to target-deviant answers. This explanation does 
not always apply, because in Spanish inherent aspect, lexical marker and the 
actual grammatical context (i.e., the (im)perfectivity of the verb phrase) are 
entirely independent. As shown in the studies above, a temporal marker is not
always helpful, but can be misleading. This is the case when an adverbial indicating
completeness appears in an imperfective context, or if an adverbial of durativity
appears with a bounded event. Likewise, the inherent aspect can coincide with the
grammatical one, but does not necessarily have to.

Only the grammatical context determines the verb form, so by maintaining
the terminology of helpful vs. misleading²⁰, we can categorize the possible
combinations into four types:


i. Helpful marker and helpful inherent aspect
   a. In a perfective context: Preterit marker and terminative predicate
   b. In an imperfective context: Imperfect marker and durative predicate


ii. Helpful marker and misleading inherent aspect
   a. In a perfective context: Preterit marker, but durative predicate
   b. In an imperfective context: Imperfect marker, but terminative predicate


iii. Misleading marker and helpful inherent aspect
   a. In a perfective context: Imperfect marker, though terminative predicate
   b. In an imperfective context: Preterit marker, though durative predicate


iv. Misleading marker and misleading inherent aspect
   a. In a perfective context: Imperfect marker and durative predicate
   b. In an imperfective context: Preterit marker and terminative predicate


The following examples for a perfective context (analogous arguments hold for 
the imperfective context), where the adverbials are marked in bold, illustrate the 
four types: 


i. Ayer llegué a Londres. 
Yesterday I-arrive.PRET at London. 
‘Yesterday I arrived in London’ (Preterit marker, terminative predicate) 


20 As pointed out by an anonymous reviewer, another possible terminology here would be 
prototypical/non-prototypical. Given that these terms, however, are also tightly connected to 
the Lexical Aspect Hypothesis (see e.g. Bardovi-Harlig 2000: 218; Salaberry 2008: 14), we opted 
for the use of less prejudiced terms which, at the same time, reflect the deviations between 
explicit rule-based learning and the acquisition of the underlying aspectual contrasts.


ii. Ayer caminé por el parque.
Yesterday I-walk.PRET through the park.
‘Yesterday I walked in the park’ (Preterit marker, durative predicate)


iii. En mi infancia abandoné mi patria.
In my childhood I-leave.PRET my fatherland.
‘I left my homeland during my childhood’ (Imperfect marker, terminative predicate)


iv. Siempre tuve buenos amigos.
Always I-have.PRET good friends.
‘I always had good friends’ (Imperfect marker, durative predicate)


Importantly, observing various combinations of items of the four types offers a
methodological advantage, since they can reveal different learning strategies
without having to contrast the learners’ production to that of L1 speakers. Thus,
the risk of high subjectivity (which plays a major role in grammatical aspect;
see Salaberry 2008) can be avoided.
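
The four-way typology introduced above can be stated as a simple decision procedure. The Python sketch below is an illustration under our own assumptions, not an instrument used in the studies: it takes the grammatical context, the tense form a marker is conventionally associated with, and the inherent aspect of the predicate, and returns the item type (i to iv). The function name classify_item and the string labels are hypothetical.

```python
# Illustrative sketch only; names and labels are hypothetical.
FORM_FOR_CONTEXT = {"perfective": "PRET", "imperfective": "IMP"}
FORM_FOR_ASPECT = {"terminative": "PRET", "durative": "IMP"}


def classify_item(context, marker_form, inherent_aspect):
    """Return the item type 'i', 'ii', 'iii' or 'iv'.

    context:         'perfective' or 'imperfective' (decides the target form)
    marker_form:     'PRET' or 'IMP', the form the adverbial usually occurs with
    inherent_aspect: 'terminative' or 'durative'
    """
    target = FORM_FOR_CONTEXT[context]
    marker_helpful = marker_form == target
    aspect_helpful = FORM_FOR_ASPECT[inherent_aspect] == target
    if marker_helpful and aspect_helpful:
        return "i"
    if marker_helpful:
        return "ii"
    if aspect_helpful:
        return "iii"
    return "iv"


if __name__ == "__main__":
    # The four perfective examples from the text.
    examples = [
        ("Ayer llegué a Londres.", "PRET", "terminative"),
        ("Ayer caminé por el parque.", "PRET", "durative"),
        ("En mi infancia abandoné mi patria.", "IMP", "terminative"),
        ("Siempre tuve buenos amigos.", "IMP", "durative"),
    ]
    for sentence, marker_form, aspect in examples:
        print(classify_item("perfective", marker_form, aspect), sentence)
```

Because the classification depends only on the relation between the cues and the grammatical context, it can be applied to test items without reference to native-speaker judgements.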

This consideration is the key to the comparison of the studies mentioned 
earlier. Precisely in those cases where only one element is helpful and the other 
element is misleading, German-speaking and Dutch-speaking learners behave
in diametrically opposed ways. That is, Dutch learners seem to adhere to inherent aspect
(Gonzalez 2003, 2013), whereas German speakers focus their attention on adverbs 
(Diaubalick/Guijarro-Fuentes 2017; Diaubalick forthcoming). 

The striking differences derived from the studies are summarized in the fol- 
lowing table, revealing the distinct learning mechanisms: 


Table 3: Summary of previous results on SLA of L2 Spanish aspect by Dutch and German
learners


Results according to the relation between temporal adverbial, inherent aspect and target form:

                              ‘helpful marker’                 ‘misleading marker’

‘helpful inherent aspect’     Advanced learners of both L1     German speakers diverge
                              groups perform on a              from native group
                              native-like level

‘misleading inherent aspect’  Dutch speakers diverge           Both learner groups diverge
                              from native group                from native speakers


3.4 Discussion and conclusion


The comparison of the studies above suggests that the interlanguages of Dutch 
and German-speaking learners of Spanish differ considerably from each other. 
While it is true that the participants of the different studies were not on the same
level of Spanish, no study found that increasing proficiency would
lead to crucially different learning patterns. The most probable reason for the
observed differences between the groups thus lies in the L1-effect. Although this 
is an attempt to explain the differences between groups, we believe that the 
empirical data supports our argument. Future studies could address the question 
of why this effect manifests itself as it does. That is, we need to clarify two issues: 


(I) Why do Dutch-speaking learners base their selection on inherent aspect,
occasionally ignoring even helpful lexical triggers?

(II) Why do German-speaking learners behave in the opposite way (i.e., why
do misleading markers lead to target-deviant structures), and why don’t
they follow inherent aspect clues, even if in some cases this would lead to
target-like behaviour?


In both constellations, learners seem to have developed learning strategies which
arguably serve to compensate for difficulties with the acquisition of contrasting
aspectual clues. Of course, it is likely that pedagogical input has led to the use of
such strategies (see Cadierno 2000; Rothman 2008 for a defence of that position).
Many textbooks of Spanish offer long lists of temporal markers (known as
Signalwörter ‘signal words’ in Germany), based on which instructors deliberately try to
simplify the complex selection task a learner must face.

However, it is important to note that didactic traditions cannot be the main 
reason for the peculiarity of the German group, as similar instruction methods are 
also present in the Netherlands (and other countries world-wide), whereas the 
concept of inherent aspect, in contrast, is rarely mentioned (Gonzalez 2008). Thus, 
the mere assumption that learners behave as they do as a result of pedagogical 
methods cannot explain why Dutch-speaking learners base their learning strategy 
on a non-taught element, and even ignore markers in helpful contexts. A possible 
explanation for this fact is that Dutch learners of Spanish rely on their own aspec- 
tual clues (in this case inherent aspect) and apply them to the Spanish grammatical 
aspect contrast. 

In sum, pedagogical input cannot be the only explanation for the observed 
results. Since the learners in the studies presented here are generally comparable 
as to their age and education level (all participants are university students aged 
(insert average age)), the L1 seems to be an important factor that clearly
distinguishes the groups. We hence argue that the different learning patterns are likely
to be due to subtle differences in the grammars of German and Dutch. But how
can the different L1-effects be explained by purely linguistic data?

It is here where the concept of micro- and macroparameters based on Roberts 
(2014) comes into play (recall section 2.1.4). As we have argued, the differences 
between German and Spanish on the one hand (macroparametric) and between
Dutch and Spanish on the other hand (microparametric) are different in kind,
which logically amounts to saying that Dutch and German cannot be equivalent in
their tense-aspect systems. This explains the behaviour of the learners
investigated in the studies. Due to the basic aspectual encoding in their native language,
Dutch learners are aware of the concept of grammatical aspect and so they know 
that it can be relevant for expressing a perspective or viewpoint. This seems to
have a positive effect on their sensitivity to aspectual markers in general.
Although they do not achieve native-like competence for the organization of aspect
in L2 Spanish (as their selection is not based on the notion of perfectivity), their
learning strategy is indisputably based on aspect. The only “error”²¹, then, is that
they choose inherent aspect instead of the grammatical one. This provides insight
into issue (I) discussed at the beginning of this section. 

German learners, in contrast, do not consider any aspectual notion, i.e., 
their choice is neither based on grammatical nor on inherent features. Instead, 
the developed learning strategy is based on surface structure elements such as 
temporal adverbials. This can be explained by the lack of grammatical aspect in
German, which hinders any consideration of aspectual information; this
is why the learners behave as stated in (II).

This explanation, in turn, allows us to confirm the assumed properties of the
German and the Dutch tense and aspect systems and thus shows how the
investigation of SLA can contribute to linguistic theory. The assumptions adopted here
are compatible with our data: whereas the Dutch grammar features an aspectual
contrast that simply does not coincide with the Spanish one, German²² does not
possess grammatical aspect features at all. That is, an interchange of the competing
tense forms in German does not change the aspectual content of a sentence but is 
merely related to stylistic factors. In contrast to Dutch, a progressive form is neither 
grammaticalized nor consistently used (Krause 1997). The subtle differences, which 


21 The reason for our use of quotation marks in this context relates to the comment above. We 
do not want to deny a systematicity to the learners’ interlanguage, but simply attest a deviation 
from the native-like system. 

22 This affirmation concerns of course the spoken language from where a learner could possibly 
start with a transfer.

in direct translation between German and Dutch have few consequences, cause
major inconsistencies in the SLA of Spanish.

Contrary to what Housen (2000) and Izquierdo/Collins (2008) state as a
general conclusion, we have not found clear evidence that greater L1-L2
differences lead to a higher reliance on inherent aspect features. On the contrary, the
German learners presented in the studies above did not seem to rely on aspectual
features at all, although the L1-L2 difference is the largest one in this case. We can
therefore conclude that such reliance can only take place if the learners are aware
of the concept of (grammatical) aspect in the first place. This is only the case when the L1
contains at least a basic contrast, as in the grammars of English or Dutch.

The comparison of different SLA studies has shown how empirical data in an 
applied field can be used to contribute important evidence for linguistic analysis. 
Future research should validate these arguments with the support from more 
experimental data on the subject both from a theoretical and from an applied 
point of view. The main conclusion drawn from the results presented above is that 
the Dutch and the German verb system differ in the grammaticalization of aspect 
and that this claim can explain the differences in behaviour of learners of L2 
Spanish.


Dietha Koster (Münster)/Hanneke Loerts (Groningen)
Food for psycholinguistic thought on
gender in Dutch and German: a literature
review on L1 and L2 production and
processing


Abstract: The aim of this paper is to explore how variation in the expression of 
gender has been and can be exploited to study gender perception in speakers 
of Dutch and German. We provide an up-to-date literature review on descriptive 
and psycholinguistic research on gender for these languages, considering empir- 
ical studies on both native (L1) and second language (L2) acquisition. This paper 
contributes to placing existing literature on gender in Dutch and German in a 
comparative mode and to offering a concrete rationale (e.g., three lines of enquiry) 
to move the psycholinguistic study of language, cognition and gender forward. 


Zusammenfassung: Das Ziel dieses Papers ist es herauszufinden, wie sich Unter- 
schiede von Gender im Ausdruck von Sprache erfassen lassen und wie die Ergeb- 
nisse genutzt werden können, um die Perzeption von Gender bei niederländischen 
und deutschen Sprechern zu untersuchen. Vorgestellt wird eine aktuelle Übersicht 
der deskriptiven und psycholinguistischen Forschungsliteratur zum Thema Gen- 
der in diesen beiden Sprachen, wobei der Fokus auf empirischen Studien nach 
Erst- (L1) und Zweitspracherwerb (L2) liegt. Dieser Beitrag ermöglicht somit den 
Vergleich der vorhandenen deutschen und niederländischen Genderliteratur und 
liefert im Ergebnis konkrete Ansätze (z.B. drei Forschungsgebiete), um die psycho- 
linguistische Forschung von Sprache, Kognition und Gender weiter voranzubringen. 


1 Introduction 


An increasing number of studies investigate effects of gender language(s) on
cognition (Garnham et al. 2016). Speakers of all languages are familiar with
semantic gender, as in the words father and daughter, where the referent’s
biological sex is encoded in the word’s meaning.¹ The WALS sample suggests
that 127 languages (43.6%) also have a lexical gender system that divides all 


1 We thank two anonymous reviewers for constructive criticism and comments.

nouns in the language into categories such as feminine, masculine, and neuter
(Dryer/Haspelmath 2013 (eds.)).² These systems are often referred to as
grammatical gender systems since the genders of nouns are reflected in other words
related to these nouns (henceforth, we employ the term lexical gender). Any- 
one who has ever learned a foreign language with lexical gender will be famil- 
iar with the seemingly arbitrary division of (especially inanimate) nouns across 
these categories. For example, why is the word for sun feminine in German (die 
Sonne) and the word for moon masculine (der Mond)? And why do the French 
refer to that same sun as masculine (le soleil), but to the moon as feminine (la 
lune)? Interestingly, even for animate nouns, there is not always an overlap 
between semantic and lexical gender. A telling example is the two-way Dutch
gender system that used to consist of masculine, feminine, and neuter nouns,
but now only contains common de-words and neuter het-words (De Vogelaer 
2012). There is thus no longer a lexical gender marker distinguishing the woman 
(de vrouw) from the man (de man). While such a distinction is still present in 
the three-way German gender system (die Frau [the woman] and der Mann [the 
man]), it is not always consistent with semantic gender. Men, for example, 
mostly belong to the masculine gender category (e.g., der Herr [sir]), but 
women - both in German and in Dutch - are also assigned to the neuter lexical 
gender category: das Mädchen; het meisje [the girl] and das Weib; het wijf [the 
woman]. 

Psycholinguistic research thus far suggests that both semantic and lexical 
gender have profound effects on how we perceive gender of animate and inani- 
mate entities. For example, Boroditsky/Schmidt/Philipps (2003) found that lex- 
ical gender affects whether German and Spanish speakers perceive objects as 
masculine or feminine, even without producing language. Yet, due to limita- 
tions and failures to replicate (Everett 2013), it is unclear whether such effects 
hold across languages and contexts. Semantic gender marked for professional 
titles affects whether women are perceived as relevant candidates for
leadership positions. Horvath/Szcesny (2016), for example, found that a hiring
committee perceived female applicants as less fit to fill high-status leadership
positions when a male noun was used generically with or without (m/f) (e.g.,
Geschäftsführer (m/w) [CEO]) in German job advertisements (also see Vermeu- 
len 2018). Women and men were perceived as equally fit when pair forms were 


2 Most languages contain some nouns that can be placed into more than one category, such as 
German die/das Cola [the coke] and die/das Email [the email], and Dutch de/het marsepein [the 
marzipan] and de/het matras [the mattress]. In most of these cases the preference for one gender 
or the other seems to be determined by geography or style (also see Semplicini 2012).

employed (e.g., Geschäftsführerin/Geschäftsführer [CEO-fem/male]). On the other
hand, it has been shown that the use of German and Italian female professional
titles decreases the value that participants ascribe to typically feminine
professions (Horvath et al. 2015). Whether and how feminizing or neutralizing
practices benefit gender fairness in different languages thus remains to be
determined. Though German semantic gender has been investigated quite extensively
from a psycholinguistic perspective, research into Dutch gender, in compari- 
son, is less exhaustive. Many studies have focused on gender perception in 
adult, native (L1) speakers of a language, whereas only in more recent years, 
research on gender perception in younger and bilingual speakers has started to 
increase. 

The aim of the present paper is to provide an up-to-date literature review on 
descriptive and psycholinguistic gender research in Dutch and German. The 
overarching goal is moving the study of perception of gendered language in 
both L1 and second language (L2) speakers forward (as argued for in Pavlenko 
2014). We first focus on descriptive linguistic studies to identify similarities and 
differences in gender marking in grammar and for personal nouns. An impor- 
tant reason for focusing on personal nouns is that they constitute a basic and 
culturally significant lexical field in almost any language (Hellinger/Bußmann 
2001). Second, we focus on psycholinguistic work that has investigated the 
acquisition, processing and perception of gender in Dutch and/or German. We 
consider both studies that take offline measures (e.g., questionnaires, categori- 
zation tasks etcetera) as well as online measures (e.g., eye tracking and event- 
related brain potentials (ERPs)). We consider both research on gender percep- 
tion with child and adult L1 speakers as well as studies with L2 learners of Dutch 
and/or German. Studies into child L1 acquisition are of interest, as they show at 
what ages gender categories are formed and used during production and percep- 
tion. This is both of theoretical as well as descriptive value, as observed norms 
may be employed in educational and clinical practice. In line with Pavlenko 
(2014) we argue that L2 learners too constitute a favorable ground for testing 
psycholinguistic theory: “two languages in one mind” provide a unique test case 
for questions into, for example, changes in perception when shifting between 
languages. Perceptive shifts have been well-documented for, for example, 
motion perception across languages (Athanasopoulos et al. 2015), but for gender 
perception, bi- or multilingual studies have started to emerge only within the 
last decades.


2 The expression of gender in Dutch and German


Hellinger/Bußmann (eds.) (2001, 2002, 2003) and Hellinger/Motschenbacher 
(eds.) (2015) have provided a systematic description of similarities and differ- 
ences in the expression of gender for over 40 languages. Drawing on Hellinger/ 
Bußmann (2001: 7), we define semantic gender as the specification of nouns “as 
carrying the semantic property [female] or [male]”, which may relate to the ref- 
erent’s sex. Lexical gender, on the other hand, can be defined as a noun classi- 
fication system, which “is reflected beyond the nouns themselves in modifica- 
tion required of ‘associated words’” (Corbett 1991: 4). The latter part of this 
definition refers to grammatical gender, the agreement system which, as 
argued by Corbett, is the most defining attribute of gender. Though Dutch and 
German are both West Germanic languages, we find differences in the expres- 
sion of semantic and lexical gender. Before discussing the semantic gender sys- 
tems, the next section reviews lexical gender in the two languages. The theoreti- 
cal distinction between semantic and lexical gender will not always be upheld in 
the following due to studies that incorporate both concepts. We use the term 
grammatical gender in case authors have employed this term themselves. 


2.1 Lexical gender in Dutch and German 


All nouns in Dutch and German, as in any language with a lexical gender system, 
belong to one of the gender categories of the language. There is sometimes a 
correspondence between the “feminine” and the “masculine” gender class and the 
specification of a noun as female-specific or male-specific (the German lexically 
masculine noun Mann [man] is preceded by the masculine article der [the]). For 
inanimate nouns, however, the gender category of a word often seems com- 
pletely arbitrary and can differ between languages, showing that lexical gender 
differs from biological gender. Gender comes from the Latin word genus,
which means “sort” or “class”. Languages with lexical gender are thus languages 
that have a small number of gender classes that nouns are assigned to, typically 
two or three: masculine, feminine and neuter. In addition to the assignment 
system, i.e., the classification of nouns belonging to a gender category, there is 
an agreement system. In languages such as German and Dutch, nouns do not 
necessarily carry markers of gender class membership, but there is obligatory 
agreement with other word classes, such as articles, adjectives, pronouns, verbs, 
numerals or prepositions.


2.1.1 Lexical gender in Dutch


Dutch currently employs a two-way gender system but used to have a three-way 
lexical gender system with masculine, feminine, and neuter nouns.³ The original
masculine and feminine genders have evolved and collapsed into one common
gender category and, consequently, this category is relatively large, making up at
least 75% of all nouns (van Berkum 1996: 23). Current Netherlandic Dutch thus
employs a single definite determiner (e.g., de [the]) with nouns, that does not 
mark masculine or feminine gender (Gerritsen 2002; De Vogelaer 2012). Traces 
of masculine and feminine gender are still visible in references using personal 
(hij/hem [he/him] and zij/haar [she/her]) and possessive pronouns (zijn [his] 
and haar [her] (also for inanimates, see Audring 2006)). Furthermore, many 
substandard Belgian Dutch varieties still mark masculine, feminine and neuter 
gender. There is some evidence that in certain contexts, gender is transferred to 
German inanimate cognates when Belgian Dutch speakers are asked to assign Ger- 
man articles to given nouns (Vanhove 2017). 

Gender agreement in Dutch occurs with determiners, adjectives, and pronouns. 
In singular noun phrases (NPs), agreement is marked either on the determiner (in 
definite NPs) or on the adjective (in indefinite NPs), but never on both. As can be 
seen in Table 1, the attributive adjective always receives a schwa ending (-e), except 
in indefinite neuter NPs or NPs that are not preceded by determiners. Distinct gender
marking disappears, i.e. is neutralized, when Dutch nouns are used in the plural
or in their diminutive form. All plural nouns receive the common determiner de
and the schwa ending on adjectives (e.g., de blauwe auto’s [the blue cars] and de blauwe
boeken [the blue books]), and all diminutives receive neuter gender marking (e.g., het
autootje [the car-little] and het boekje [the book-little]) (Haeseryn et al. 1997).
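
The agreement pattern just described can be summarized in a minimal Python sketch. It is an illustration only, not an implementation taken from the literature cited here: the function names are hypothetical, spelling adjustments on inflected adjectives (e.g., groot/grote) are ignored, and plural diminutives are assumed to take de like any other plural.

```python
def definite_determiner(gender, plural=False, diminutive=False):
    """Return the Dutch definite determiner for a noun.

    gender is the noun's lexical gender: "common" or "neuter".
    """
    if plural:
        return "de"    # gender is neutralized in the plural
    if diminutive or gender == "neuter":
        return "het"   # singular diminutives take neuter marking
    return "de"


def attributive_adjective(stem, gender, definite, plural=False):
    """Add the schwa ending unless the NP is singular, indefinite and neuter."""
    if gender == "neuter" and not definite and not plural:
        return stem        # een mooi boek
    return stem + "e"      # de mooie auto, het mooie boek, een mooie auto


def noun_phrase(noun, gender, stem, definite=True):
    """Build a singular NP of the kind listed in Table 1."""
    det = definite_determiner(gender) if definite else "een"
    return f"{det} {attributive_adjective(stem, gender, definite)} {noun}"


print(noun_phrase("auto", "common", "mooi"))                  # de mooie auto
print(noun_phrase("boek", "neuter", "mooi", definite=False))  # een mooi boek
```

In this sketch, gender surfaces on the determiner in definite NPs and on the adjective in indefinite NPs, which is the complementary distribution described above.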


3 As the examples of Oriya and English show, a gender system can erode (Oriya) and eventually
be lost (English) (Hellinger/Bußmann (eds.) 2003). It has been argued that the present Dutch
gender system is in a state of erosion (De Vogelaer 2012; Loerts 2012).


2.1.2 Lexical gender in German


In German, nouns are marked as masculine, feminine, or neuter with the articles
der (masculine), die (feminine) and das (neuter) (Bußmann/Hellinger 2003). While
Dutch has a more frequent common category, German masculine seems to be a slightly
larger category (38%) than feminine (35%) and neuter (26%) according to the CELEX
corpus (Schiller/Caramazza 2003). German articles are not only declined for gender,
but also for number and case. For example, der is not only employed as masculine
nominative, but also as feminine dative and genitive as well as plural genitive. Moreover,
die is not only employed as feminine nominative, but also as feminine accusative
and plural nominative. Gender is thus neutralized in plural forms (Bußmann/
Hellinger 2003). Unlike case or number, however, lexical gender is an inherent 
property of the noun that controls agreement between that noun and syntactically 
related words. As can be seen in Table 1, in the singular nominative case, definite NPs 
and many pronouns, such as demonstrative pronouns, only show gender marking 
on the determiner. For indefinite NPs, gender marking is present on the adjective, but 
the indefinite determiner also receives the -e suffix when modifying feminine nouns. 
While the German system is rather straightforward in the definite singular 
nominative case, the system becomes less transparent for plurals and/or indefinite 
NPs. As in Dutch, the plural definite determiner is not gender marked. Similarly,
like Dutch, although more complex, German definite determiners are always 
declined in the definite NPs and for pronouns, while adjectives in indefinite NPs or 
NPs without determiners are not declined. For examples including the different 
cases, the reader is referred to Duden (2006) or Bergmann (2017: chapter 6). 
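
The syncretism of der and die mentioned above can be made explicit with a small lookup sketch in Python. The paradigm below is the standard German definite article paradigm and is not taken from Table 1, which covers the singular nominative only; the names DEFINITE_ARTICLES and slots_for are hypothetical and serve illustration only.

```python
# Standard German definite article paradigm (case x gender/number).
DEFINITE_ARTICLES = {
    "nominative": {"masculine": "der", "feminine": "die", "neuter": "das", "plural": "die"},
    "accusative": {"masculine": "den", "feminine": "die", "neuter": "das", "plural": "die"},
    "dative":     {"masculine": "dem", "feminine": "der", "neuter": "dem", "plural": "den"},
    "genitive":   {"masculine": "des", "feminine": "der", "neuter": "des", "plural": "der"},
}


def slots_for(form):
    """List every (case, gender/number) cell realized by an article form."""
    return [
        (case, category)
        for case, row in DEFINITE_ARTICLES.items()
        for category, article in row.items()
        if article == form
    ]


for form in ("der", "die"):
    print(form, slots_for(form))
```

Running the sketch shows that der fills four cells (masculine nominative, feminine dative and genitive, plural genitive) and that die likewise fills four, so a single article form is not by itself a reliable cue to a noun's gender.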


Table 1: Overview of the main agreement targets in Dutch and in the German singular
nominative case (English equivalents given below each example)

Language  Gender     Definite Noun Phrase    Indefinite Noun Phrase  Pronouns
Dutch     Common     de mooie auto           een mooie auto          die/deze mooie auto
                     the beautiful car       a beautiful car         that beautiful car
          Neuter     het mooie boek          een mooi boek           dat/dit mooie boek
                     the beautiful book      a beautiful book        that beautiful book
German    Masculine  der schöne Garten       ein schöner Garten      dieser schöne Garten
                     the beautiful garden    a beautiful garden      that beautiful garden
          Feminine   die schöne Schule       eine schöne Schule      diese schöne Schule
                     the beautiful school    a beautiful school      that beautiful school
          Neuter     das schöne Buch         ein schönes Buch        dieses schöne Buch
                     the beautiful book      a beautiful book        that beautiful book

2.1.3 Dutch versus German: differences in
transparency and systematicity


A clear difference between the Dutch and German gender systems concerns the 
different number and size of categories, but there are also differences in gender 
assignment. Three ways of assigning inanimate nouns to a gender category have 
been proposed in the literature: semantic, phonological and morphological 
(Corbett 1991; Mills 1986; van Berkum 1996). Examples of each of the three can be 
found in German, but in Dutch, mostly morphological rules are found (Booij 2002). 
Regarding semantics, for example, German nouns denoting birds are generally 
masculine while nouns denoting trees or flowers are generally feminine (Szagun 
et al. 2007). Although not as prominent as the rules in Romance languages, German 
also has some phonological cues. For example, words ending in nasal consonants 
are mostly masculine while words ending in -e and -ie are mostly feminine (Mills 
1986). Morphological cues in German nouns are, for example, the suffix -ling which 
can generally be associated with masculine gender and the suffix -keit generally 
predicting feminine gender (Corbett 1991; Köpcke/Zubin 2009). The Dutch gender 
assignment system also has a (small) number of suffixes that can be used to 
assign genders, in line with the original three-way distinction, such as -je for 
neuter gender (diminutives), -aard for masculine words and -heid for feminine 
words. These morphological rules, however, tend to include many exceptions 
(Booij 2002; Haeseryn et al. 1997). Furthermore, in both Dutch and German, in a
small number of cases, lexical gender does not match with the biological gender 
of the referent (see Introduction). 
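
To illustrate how the German cues named above might be combined, the following Python sketch applies the morphological cues first and the phonological cues second. It is a toy heuristic under stated assumptions: the cue lists contain only the examples mentioned in this section, the ordering of cue types is an assumption made for the sketch, and the function name is hypothetical. Since the cues are tendencies rather than rules, wrong guesses are to be expected.

```python
# Cues restricted to the examples named in the text.
MORPHOLOGICAL_CUES = {"ling": "masculine", "keit": "feminine"}
NASAL_ENDINGS = ("m", "n", "ng")


def guess_german_gender(noun):
    """Return (gender, cue type), or (None, None) if no cue applies."""
    word = noun.lower()
    for suffix, gender in MORPHOLOGICAL_CUES.items():
        if word.endswith(suffix):
            return gender, "morphological"
    if word.endswith("e"):            # covers both -e and -ie
        return "feminine", "phonological"
    if word.endswith(NASAL_ENDINGS):
        return "masculine", "phonological"
    return None, None


for noun in ("Frühling", "Möglichkeit", "Erde", "Garten", "Buch"):
    print(noun, guess_german_gender(noun))
```

A noun such as Mädchen [girl], which is neuter, would be misclassified as masculine by the nasal cue, which shows why such cues can only support, not replace, the agreement-based acquisition of gender.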

While pronouns in German still mostly agree with the lexical gender of the 
noun they refer to (albeit with some exceptions), the erosion of the Dutch system, 
i.e., the loss of the distinction between masculine and feminine nouns, seems to 
lead to the use of a more semantically based pronominal system. This shift is 
mostly in favor of the masculine category: many originally feminine nouns, e.g., 
zon [sun], are nowadays referred to with the masculine pronoun hij [he] in Nether- 
landic Dutch (Haeseryn et al. 1997). This is also the case for non-diminutive neuter 
nouns, e.g., het kind [the child]; het masker [the mask] (see Audring 2009). It 
should be noted, however, that Audring also found some degree of semantic 
agreement in the consistent German system, which is, like in Dutch, based on 
lexical gender (for animate nouns) or degrees of individuation of the referent (for 
inanimate nouns). Compared to German, however, where there is generally a 
one-to-one correspondence between articles and their related pronouns, Dutch 
is rather unstable and variable: nouns do have a fixed gender, but they are not
consistently used with the agreement pattern that corresponds with that gender
category. In fact, all possible article-pronoun combinations seem to occur in
Netherlandic Dutch (Audring/Booij 2009). Because of the relative scarcity
of gender cues on the noun itself, Dutch and, to a lesser degree, German, are both
considered covert gender systems, i.e., systems in which the category to which a
noun belongs can hardly be predicted based on phonological or morphological
characteristics of that noun (e.g., Booij 2002; Corbett 1991; Haeseryn et al. 1997).
The categories that nouns belong to must thus largely be acquired based on cues
on agreement targets, but this agreement system is, especially in Dutch, neither
consistent nor straightforward. Whether this has any consequences for L1 and/or L2
learners will be discussed in section 3. 


2.2 Semantic gender for personal nouns 
in Dutch and German 


Semantic gender relates to the property of non-linguistic maleness or female- 
ness as encoded in a noun’s lexical meaning. Personal nouns may thus be 
marked as female-specific (e.g., dochter; Tochter [daughter]), male-specific (e.g., 
meneer; Herr [sir]) or gender-neutral (e.g., persoon; Person [person]) (Hellinger/
Bußmann 2001).


2.2.1 Gender confusion in Dutch 


Gerritsen (2002) writes that Dutch has two types of role nouns: terms that indicate 
the gender of the person who practices the role (male or female) and terms that 
do not. Not all personal masculines have a feminine counterpart. In this respect, 
Dutch differs substantially from German where almost all personal nouns refer- 
ring to men can be transformed into feminine equivalents by adding the suffix 
-in (Bußmann/Hellinger 2003). Furthermore, the number of suffixes that can be 
used to feminize personal nouns denoting men is larger in Dutch than in German 
(see Table 2). The most productive female-specific suffix in Dutch is -e, which espe- 
cially occurs with loans (assistent-assistente [assistant]); nouns ending in -ing 
(leerling-leerlinge [pupil]); and some other nouns (echtgenoot-echtgenote [spouse]). 
Other productive suffixes are: -ster (e.g., voorzitter, voorzitster [chairperson]);
-euse and -trice (e.g., presentator, presentatrice [presenter]); -a (e.g., historicus,
historica [historian]); -es and -esse (e.g., baron, barones/se [baron/baroness]); -in
(e.g., boer, boerin [farmer]). Apart from male and female nouns, Gerritsen (2002)
distinguishes a category with gender-neutral terms (with a masculine history) 
such as dokter [physician]; professor [professor]; psychiater [psychiatrist]. Note 
that many of these terms end in -er. Like German, Dutch allows for nominalization 
of adjectives (de zieke [the sick person]) and verbs (de reiziger [the traveler]) that
supposedly are gender-neutral as well. 

Gendered professional titles became a subject of debate in the Netherlands in
the 1970s (Kool-Smit 1967; Romein-Verschoor 1975), a debate that has recently
resurfaced (Peters 2016; Bolle 2017). In 1980, the law prescribed that in personnel
advertisements, men and women should be equally visible. However, it has been 
found that different types of professional names were and are used unsystematically
(Gerritsen 2002). At the time, the Ministry of Social Affairs recommended
(new) neutral terms (cf. Werkgroep 1982), but the media as well as linguistic cir- 
cles responded negatively. Formations such as timmer to refer to a carpenter 
(instead of timmerman or timmervrouw, which mark gender) were considered 
ridiculous and some linguists argued that the terms were not neutral at all, but 
only referred to men (e.g., dominee, minister, notaris [vicar, minister, notary]). 
Van Alphen (1983) and Huisman (1985) therefore advocated gender-specific 
names for professions. In the end, no official guidelines were determined, but 
the Dutch Language Union published a volume that presents social and linguis- 
tic information about the issue and possibilities to avoid linguistic sexism (de 
Caluwe/van Santen 2001). Recently, calls for gender-neutral language resurfaced 
in the Dutch media (Peters 2016; Bolle 2017; Meindertsma/de Bruijn 2017),⁴
although for many professional titles it is still far from clear whether they are 
(perceived as) gender-neutral. In terms of gender fairness, ample research sug- 
gests that female visibility in fe/male pair forms yields more gender fair percep- 
tions in Dutch than neutralizing terms (see section 2.2.3 below), but more 
research is needed to disentangle these effects. 


4 In July 2016, the Dutch LGBTQ organization announced that the pronoun hen [s/he] won the
contest for a gender-neutral alternative to hij [he] and zij [she] (Peters 2016). It is unknown whether
and how this pronoun is used in current Dutch.


2.2.2 Female visibility in German


Almost all German masculine personal nouns can be made feminine with the
feminine suffix -in, as in Student (base, [student]) and Student-in [student-female]
(Bußmann/Hellinger 2003). The suffix -in is well established, has no negative
connotations and does not indicate lower status. Other feminizing strategies,
which are more contested, are coordinated pair forms (Lehrerinnen und
Lehrer [teachers-female and teachers-male]) or various forms of abbreviated
splitting, sometimes with capital-I (Lehrer/Innen). Nouns ending in -er are
taken to be masculine, as in fahr- (base, [drive]), Fahr-er [driver-male] (Bußmann/
Hellinger 2003). The suffixes -ler and -ner are male-specific as well, but may, in 
addition, be used in generic contexts (i.e., to refer to women as well). There are 
hardly any exceptions to the rule that a noun’s gender is invariant. Only a few nouns
can be described as genuinely gender-indefinite, such as Person, Kraft, Mensch
[person, force, human]. Such nouns can become gendered though, through
adjectival modification (e.g., eine weibliche/männliche Person [a fe-/male person]).
Further, there are some occupational terms which are compounds containing
-mann or -frau as a second element (e.g., Kaufmann/Kauffrau [salesman/-woman]).
Nominalized adjectives and participles may be assigned any
of the three genders by the choice of dependent categories. For example, the
adjective krank [sick] can be nominalized into der Kranke [the sick person-male]
and die Kranke [the sick person-female] (Bußmann/Hellinger 2003). In contrast
with Dutch, the definite article will mark gender after all for these German 
forms. 

Though English guidelines (UNESCO 1999, 2017) emphasize neutralization, 
German guidelines prioritize female visibility. Bußmann/Hellinger (2003) argue 
this is a consequence of several factors: the existence of lexical gender; the ten- 
dency towards its agreement with semantic gender; and the fact that derivation 
of feminine personal nouns is embedded in the German word-formation system. 
Female visibility is mandatory in gender-specific contexts but is also recommended
in all contexts that include female referents. Alternative options are the
use of gender-indefinite nouns, as in Lehrpersonen, Lehrkräfte [teachers], or
nominalized plural forms, which do not differentiate lexical and hence referential
gender in German (as in Auszubildende [trainees], Drogensüchtige [drug
addicts]). In the singular, female visibility must be achieved. However, masculine
forms that are part of inanimate compounds are not subject to change:
Benutzerhandbuch [user manual], Führerschein [driving license] (Bußmann/
Hellinger 2003). 


Table 2: Noun endings that have been reported to mark gender for Dutch and German singular
role nouns

Language  Feminine ending(s)        Masculine ending(s)  Gender-neutral ending(s)
Dutch     -e, -ing, -ster, -euse,   -er                  -e; -er
          -trice, -a, -es, -esse
German    -in                       -er, -ler, -ner      -e (e.g., generic in plural)
2.2.3 Male generics in Dutch and German


As noted above, male generics have been an important topic of public debate 
for both Dutch and German. Male generics are (historically) masculine personal 
nouns (e.g., doctor, professor, psychiatrist) that are used either generically, i.e., 
referring to both women and men, or specifically, i.e. referring to only men. 
German research, which started in the 1990s, has convincingly shown that generic
use of masculine personal nouns is strongly male-biased (see Horvath et al.
2015 for an overview). These studies arrive at convergent conclusions using different
techniques such as sentence completion tasks, reaction time measurements,
reading tasks and questionnaires. For Dutch, it is claimed that masculine
terms (e.g., medewerker [employee]) and neutralizing terms (e.g., nouns
that have no feminine counterparts, such as arts [doctor], or that are inherently
gender-neutral, such as hoofd [head]) are increasingly used to refer to both women and men
(Gerritsen 2002). This claim was supported by De Backer and De Cuypere (2012), 
who used survey experiments to investigate how German and Belgian Dutch 
speakers interpret masculine personal nouns used in a referential context. 
Results suggest that German masculine nouns are more restrictive in potential 
reference than Dutch nouns. This effect is more pronounced in the singular 
than in the plural. Not only number, but also lexical type played a role as non- 
occupational nouns tended to be more gender-neutral as compared with occu- 
pational nouns. 

Male- and female-specific forms versus male generics may have a severe 
impact on visibility and perception of (successful) women (Lassonde/O’Brien 
2013). Stahlberg/Sczesny/Braun (2001), for example, asked German participants
to write down names of their favorite musicians or athletes. Participants read 
identical instructions with either a masculine only form (Sportler [male or generic 
athlete]) or a pair form (Sportlerin/Sportler [female/male athlete]). Results showed 
that more female personalities were listed in the pair form condition than in the 
masculine only condition. Similar results have been reported with German and 
Dutch speaking Belgian 6-year old school children. Vervecken/Hannover/Wolter 
(2013) examined these children’s perceptions of females’ and males’ success 
(i.e., who can succeed?) in traditionally male occupations (e.g., lawyer). Results 
showed that children who read pair forms perceived females’ and males’ success 
more equally than children who read the masculine form only. Male generics 
instead of pair forms may even have an impact on behavior in professional con- 
texts. In a hiring-simulation study in German, for example, decision makers pre- 
ferred male over female applicants for a high-status leadership position when 
the position was advertised in the masculine (Geschäftsführer, [CEO-male]) (Hor- 
vath/Sczesny 2016). 
In sum, we can distinguish some important (a)symmetries in the expression
of semantic gender in Dutch and German. First, where Dutch has about eight
suffixes to mark whether a personal noun refers to females or not, German employs
only one. German does have a few masculine markers for singular nouns
(-er, -ler and -ner), while the ending -er in Dutch marks masculinity or gender 
neutrality. These differences are likely bound to the erosion of the Dutch gender 
system and the lack of official Dutch guidelines for professional terms in contrast 
with clearer rules and priority of female visibility in German. Perception of male 
generics has been well-investigated for German, but should be examined in more 
detail for Dutch. 


3 Gender acquisition and processing 
in Dutch and German 


Differences in the expression of gender in Dutch and German and the instability of
the Dutch gender system may well lead to differences in both the processing and the
acquisition of gender in the two languages, which may in turn explain observed 
patterns of language variation and change. Gender is also an intriguing phenome- 
non when related to bilingualism as it is realized differently in German and Dutch. 
Potential changes in perception when shifting between language systems can shed light on
the question of whether it is possible to learn a completely new or different lexical 
gender system in later stages of life and what the influence of the L1 might be in 
this process. So far, we have discussed the Dutch and German systems and have 
addressed some studies that employed offline research techniques (questionnaires,
categorization tasks, etc.). Here we will also address studies that have
employed online research techniques: eye tracking and ERPs. The advantage of 
the latter techniques over the former is that they measure online processes, i.e., 
processing of language as speech unfolds, instead of an answer or decision that 
cannot reveal the processes that led to the product. 


3.1 L1 Acquisition and the processing of gender 


When do Dutch and German children acquire their lexical gender system? While 
learners of German can rely on semantic, morphological, and phonological cues 
on the noun, learners of Dutch can only rely on morphosyntactic cues, i.e., the 
gender agreement system, to track down a noun’s gender category. As explained in 
section 2.1, however, these morphosyntactic cues are neither consistently nor reliably
present in the Dutch input. Dutch common gender marking, for example, is used
on words agreeing with any noun used in the plural (just as feminine die can
precede plurals in German), while neuter gender marking can be used with both
common and neuter nouns when they are used in diminutive form. The absence
of cues and the absence of a one-to-one relationship between gender marking
and gender in Dutch likely forces children to acquire Dutch gender word by word
(Unsworth 2008). This section will investigate whether this is indeed the case by 
comparing studies looking at production, comprehension and online processing 
of lexical gender in Dutch and German. 


3.1.1 L1 acquisition of gender in Dutch versus German: 
production studies 


In line with the above reasoning, Dutch L1 speakers have indeed been shown to 
have difficulty acquiring Dutch gender and have been shown to overgeneralize 
the more frequent common determiner de until, and sometimes even beyond, the 
age of 6 (Blom/Polišenská/Weerman 2008; Cornips/van der Hoek/Verwer 2006;
Hulk/Cornips 2006a, b; De Houwer/Gillis 1998; van der Velde 2004). This acquisi- 
tion pattern is in stark contrast with the pattern observed in German native chil- 
dren, who have been shown to use the correct articles in their L1 at the age of 3 to 4 
(Mills 1986; Szagun et al. 2007). In contrast with Dutch children, German children 
make relatively few mistakes when producing gender marking and, when they 
do not know the gender of the noun, they tend to omit the articles more often 
than overgeneralize one of the available determiners (Mills 1986). The feminine 
definite singular and plural determiner die is, however, sometimes overgeneral- 
ized (Bewer 2004; Mills 1986), but the degree of overgeneralization is not compa- 
rable to the over-occurrence of common gender marking in Dutch. Moreover, sys- 
tematic overgeneralization is generally not reported for monolingual speakers of 
other gendered languages, such as French and Spanish (Franceschina 2005). 

In both Dutch and German, but also in many other gendered languages, 
agreement between the definite article and the noun is acquired before agree- 
ment between the adjective and the noun. Before being able to accurately and 
consistently use the agreement rules, learners thus seem to put knowledge of a 
noun’s gender in place first. The first phonetic rule for inanimate nouns that is 
represented in the German child’s vocabulary is the association of the -e ending 
with feminine gender (e.g., die Erde [the earth]), which appears to be the most 
frequent rule with the fewest exceptions. In line with the order of acquisition in 
Dutch, i.e., common before neuter and definite articles before adjectival inflection,
Mills (1986: 85) concludes that the order of acquisition in German is related to the
scope of the rule and the number of exceptions. The asymmetry and the relatively
scarce availability of rules and relatively frequent occurrence of exceptions to
those rules in the Dutch gender assignment system are likely to partly explain the
relatively late acquisition of Dutch gender as compared to German that is seen in 
production studies. 


3.1.2 L1 acquisition of gender in Dutch versus German: 
comprehension and processing 


In comparison with production studies, the relatively small number of studies
incorporating comprehension tasks has only partly added evidence for the relatively
late acquisition of Dutch grammatical gender. Using a grammaticality
judgement task with determiner noun combinations, Unsworth/Hulk (2010a, b) 
showed a mean accuracy of 70% for 4- to 6-year old Dutch children and, in line 
with production data, these children accepted ungrammatical neuter determiner 
noun combinations more often than ungrammatical common NPs. While there is 
not a lot of data from children between 6- and 11-years of age, Dutch children’s 
judgement of gender marked NPs has been shown to reach adult L1-like ceiling 
level around the age of 11 (Cornips/Hulk 2008). Interestingly, when metalinguis- 
tic knowledge is not tested, as is the case with a preferential looking paradigm, 
2-year old Dutch children have been found to use the common determiner de to 
more quickly locate a target object (Johnson 2005), which corroborates later 
findings using eye tracking with a visual world paradigm that common gender 
marking is used by adults to facilitate comprehension (Loerts/Wieling/Schmid 
2013). Although only found for common nouns in Dutch, such a gender effect is 
in line with studies showing that Spanish 2- to 4-year olds, like adults, can use 
informative gender marking to identify a target referent (Lew-Williams/Fernald 
2007, 2010). 

To further investigate the acquisition process during the apparent transition 
from non-targetlike to targetlike use of, particularly, neuter gender in Dutch, 
Brouwer/Sprenger/Unsworth (2017) recently tested 4- to 7-year olds using a visual 
world paradigm. They showed that children who correctly used neuter gender in 
production behaved like Dutch adults and used gender marking to anticipate, 
i.e., to predict, the upcoming noun. The children who still made a lot of neuter 
gender mistakes in production could use gender marking to facilitate comprehen- 
sion (i.e., to speed up comprehension after hearing the determiner), but not to 
anticipate upcoming targets. Contrary to the findings by Johnson (2005) and 
Loerts/Stowe/Schmid (2013) that only common gender might have such an effect, 
which has been explained by the fact that het is less informative as it can precede 
any noun in its diminutive form, Brouwer and colleagues found the anticipatory
effect only for neuter gender marking. While this difference might be affected by 
the visual world paradigm with 4 instead of 2 pictures used by Loerts/Stowe/ 
Schmid (2013), the combined results additionally reveal information about the 
potential developmental pattern of gender acquisition in Dutch. Brouwer/ 
Sprenger/Unsworth (2017) suggest that 2-year olds may first only use de (Johnson
2005; Unsworth 2008), after which they acquire and temporarily only use het in a
facilitative fashion before being able to also use gender marking anticipatorily.
The predictive power of gender in Dutch has
not only been found to be asymmetrical when compared to other languages, but
the effect also seems a lot smaller than the predictive anticipatory effects of all
three gender marked articles that have been reported for German in adults (Hopp
2016). To the authors’ knowledge, the use of German gender marking to predict 
upcoming nouns has only been tested and established for 8- and 9-year olds 
(Lemmerth/Hopp 2019), but not for younger children. 


3.2 Gender in L2 acquisition and processing 


3.2.1 L2 acquisition of gender in Dutch versus German: 
production studies 


For L2 Dutch, like L1 Dutch as discussed in section 3.1, speech production studies 
found that L2 learners overproduce de instead of het, as shown for Moroccan 
children and Moroccan, Turkish, English, Polish and deaf adults learning Dutch 
(Blom/Polišenská/Weerman 2008; van Emmerik et al. 2009; Loerts 2012; Unsworth
2008). An overview by Cornips/Hulk (2008) revealed that the overextension of 
de in Dutch also holds for children simultaneously acquiring Dutch with French, 
Akan, Ewe and Surinamese. They point towards a prominent role for “lengthy 
and intensive input” in explaining acquisition differences between less and 
more successful bilingual children.⁵ Lemhöfer/Spalek/Schriefers (2008) report that
the L1 affected assignment of gender in the L2 for adult German learners of L2 
Dutch. This remained the case even after receiving training on gender assignment 
(Lemhöfer/Schriefers/Hanique 2010). 

5 Cornips (2008) also points out that young immigrants in the Netherlands are consciously
overextending Dutch de [the] as an identity marker.

For L2 German, production difficulties have been reported for English, Afrikaans
and Italian adult speakers (Bianchi 2013; Bobb/Kroll/Jackson 2015; Bordag/
Opitz/Pechmann 2006; Ellis/Conradie/Huddlestone 2012). Ellis/Conradie/Huddlestone
(2012) found that Italian learners outperformed English and Afrikaans speakers,
which they attribute to “deep” L1 transfer of grammatical gender (since Italian
and German both have lexical gender, but the systems are not congruent). Bianchi
(2013) also points to language-internal factors, as well as amount of input, to 
explain deviations from target gender assignment for Italian-German bilinguals. 
Eichler/Jansen/Müller (2012) report on bilingual child acquisition of German with
French, Italian and Spanish and showed that bilingual children can acquire gender 
systems in both languages without any delays. Yet children’s accuracy can be pre- 
dicted based on language dominance and German is the most problematic system 
to acquire. The authors found that accuracy on neuter gender is lower in bilingual 
than monolingual German, suggesting that simultaneous acquisition of a two- and 
three-gender system has delaying effects for target-like neuter marking. Salamoura/ 
Williams (2007) also found L1 effects for adult Greek learners in L2 German gender 
assignment. Nouns that had the same gender in both languages were translated 
faster than nouns with different genders, and gender-incongruent cognates yielded 
more errors. To our knowledge, few studies have focused on L2 production of gen- 
dered personal nouns (but see Urbanek et al. 2017; De Vogelaer et al. (this volume), 
both showing transfer effects in German L2 Dutch learners’ pronoun use). 


3.2.2 L2 acquisition of gender in Dutch versus German: 
comprehension and processing 


Hopp (2013) examined how English L2 German learners assign gendered deter- 
miners to inanimate nouns and how these are comprehended in real-time. Results 
showed contingencies between accuracy in lexical assignment in production and 
target-like processing in comprehension (as measured using eye tracking). 
According to Hopp, this argues against a representational and processing deficit 
in late L2 learners. Eye tracking studies using a visual world paradigm, like Hopp 
(2013), have examined whether L2 learners, like native adults (see section 3.1.2), 
use grammatical gender to facilitate L2 comprehension. Results suggest that both 
L2 Dutch as well as L2 German learners experience challenges in this respect. 
Loerts (2012), for example, found that late Polish learners of L2 Dutch (Polish
has no articles) cannot pre-activate L2 grammatical gender information (in
articles) to facilitate comprehension of inanimate (object) nouns nor do they 
(in)accurately use L1 gender categories (as previously found by e.g., Weber/Paris 
2004). For L2 German, using a similar paradigm, Hopp (2016) investigated whether 
English L2 German learners use grammatical gender for predictive processing and 
found that knowledge of gender assignment is a prerequisite for using gender 
marking predictively. Non-target gender assignment led to erroneous gender prediction,
and the predictive use of gender was therefore abandoned. This effect was replicated
with L1 German speakers when they heard input with gender mistakes: the low reliability
in target-like gender assignment led them to abandon the use of grammatical gender
as a predictive cue (Hopp 2016). Sabourin/Stowe/de Haan (2006) examined gender
assignment for advanced German, English and Romance learners of L2 Dutch.
They concluded that L2 acquisition of lexical gender was mostly affected by mor- 
phological similarity of gender marking in the L1, with high accuracy for German 
speakers as there is relatively more overlap between German and Dutch. 

The online processing of gender has more recently increasingly been assessed 
using event-related potentials (ERPs), which are changes in brain activity that 
show when specific aspects of language are processed. This technique has 
repeatedly shown that semantic violations or unexpectancies generally
evoke an N400, a negative-going wave around 400 ms, while morphological
and syntactic violations and difficulties generally elicit a positive deflection 
around 600 ms, known as the P600. Grammatical gender violations have consist- 
ently been found to elicit a P600 reflecting syntactic re-analysis or repair in various 
languages including Dutch (e.g., Loerts/Stowe/Schmid 2013) and German (Gunter/ 
Friederici/Schriefers 2000; Davidson/Indefrey 2009). Late L2 learners’ processing 
of L2 gender violations has been shown to be affected by the L1, with both the 
presence and the similarity of gender systems being a prerequisite for 
native-like processing (Sabourin/Stowe 2008). Interestingly, highly proficient L2 learners 
with a completely different gender system (i.e., Polish) have been found to show a 
reduced and delayed P600, but only in response to violations of common gender 
(Loerts 2012). This latter result is in line with many of the other studies discussed 
above showing difficulty especially in the acquisition of Dutch neuter gender. 

Early bilinguals are an understudied group of learners, especially with regard 
to online measures. The general idea is that simultaneous bilinguals will 
eventually catch up with their monolingual peers, but an ERP study on Dutch 
gender showed that simultaneous bilinguals with Turkish (which has no gender) 
and Dutch as L1s only show a reduced P600 when compared to their “monolingual” 
peers, with over half of them performing at chance level when judging grammatical 
gender violations offline (Seton 2011). Similarly, while 8-year-old simultaneous 
Russian-German bilinguals have been shown to be able to use German gender to 
predict upcoming words, their early successive bilingual peers could only use Ger- 
man gender cues to predict nouns if they shared gender in German and Russian 
(Lemmerth/Hopp 2019). These combined results suggest an important role for the 
L1 that requires more attention in future research. 

Another line of research has focused on pronominal gender for nouns (e.g., 
Lamers et al. 2008). Ellert (2011) studied the resolution of personal and 
demonstrative pronouns (e.g., hij/er [he]; die/der [he]) in L1 and L2 Dutch and 
German discourse for both animate 
(e.g., Peter) and inanimate nouns, using ERPs. She found that pronoun resolution 
was affected by order of mention and information structure of the antecedent 
clause, but animacy had no effect in L1 Dutch. This indicates that personal and 
demonstrative pronouns exhibit the same level of (non)ambiguity in Dutch, 
which contrasts with German, where personal pronouns were resolved earlier 
after animate antecedents. Proficiency level was an important predictor for 
native-like processing, but both Dutch L2 German and German L2 Dutch 
learners showed differences in both timing and resolution for animate entities 
when compared with L1 speakers. Urbanek et al. (2017) investigated production 
and perception of pronominal gender in German L2 Dutch learners for concrete 
mass nouns (e.g., sugar, grass etc.). In current Netherlandic Dutch, the dominant 
pronoun used to refer to de-words is hij [he] (Haeseryn et al. 1997). 
One result showed that, in contrast with L1 Dutch speakers, German L2 Dutch 
learners used zij/ze [she] to refer to de-words with feminine German counterparts 
in 64.9 percent of the cases, which can be explained by L1 transfer. Yet 
acceptability data showed that, in contrast with the preference for feminine 
pronouns in the production data, variation in pronoun use was deemed acceptable 
by L2 learners. 
De Vogelaer et al. (this volume) present new data of Austrian and German learn- 
ers of L2 Dutch on the same topic. 

Comprehension of animate nouns seems less well investigated in L2 acquisition. 
For example, we know of no studies investigating L2 acquisition of gender 
marking on personal noun endings (as outlined in Table 2), even though these words 
appear frequently in the input (Gerritsen 2002; Hellinger/Motschenbacher (eds.) 
2015) and adult L2 learners will encounter them, for example, in L2 textbook 
chapters. Some work on German has focused on the interaction of lexical gender 
and names for role nouns or nouns for stereotypical fe/male professions. Sato/ 
Gabriel/Gygax (2016), using a sentence evaluation task, examined how nominalized 
adjectives (e.g., die Konsumierenden [those that consume]), compared with 
grammatically masculine nouns (e.g., die Käufer [the buyers]), induce male representations in 
French-German bilinguals. They showed that a masculine bias persisted when 
participants read masculine plural forms, but that nominalized forms can atten- 
uate this male bias, even for nonnative speakers. In another study, Sato/Gygax/ 
Gabriel (2016) investigated French-German bilingual speakers with a visual 
world eye tracking task. They presented speakers with stereotypical, plural role 
nouns and plural determiners that have a generic meaning in French and a fem- 
inine connotation in German (e.g., les techniciens, die Techniker [the techni- 
cians]). Participants judged whether a pair of faces showing two men or a man 
and a woman could represent the presented language. Results showed no effects 
of the determiner, but an interaction of face pairs and stereotypes, where the 
preference for male face pairs that followed stereotypical male nouns was the 
most pronounced. This aligns with claims that L2 learners rely primarily on 
non-structural lexical semantic and pragmatic cues when comprehending their 
L2 (Clahsen/Felser 2006). 

In sum, grammatical gender production difficulties seem to occur both in L1 
and L2 Dutch and L2 German, and these difficulties seem to be related to profi- 
ciency, input and the presence and degree of similarity of an L1 gender system. 
L2 learners can use grammatical gender to facilitate comprehension, but only 
when they are highly proficient and hardly make mistakes. While late learners 
have been studied relatively extensively in this field of research, little is still 
known about comprehension and processing of gender in early bilinguals. For 
personal (pro)nouns, production results show that German L2 Dutch learners 
use female pronouns to refer to nouns with common gender, while L1 Dutch 
speakers use male pronouns, which can be explained by L1 transfer. Research 
has also shown that, in pronoun resolution, animacy affects how (fast) L1 
and L2 Dutch and German speakers arrive at an interpretation. When interpreting 
professional names, stereotypical knowledge affects speakers’ representations 
to a much larger degree than grammatical knowledge. It has also been shown 
that gender-neutral nominalized forms can attenuate male bias, suggesting that 
human gender categories in language can be (un)learned. 


4 Gaps in psycholinguistic, Dutch-German 
gender research 


The previous sections have outlined descriptive and psycholinguistic research 
into gender in Dutch and German. When it comes to research into bilingualism, 
language and gender in general, Pavlenko (2014: xiii) observed that “the research 
on (grammatical) gender [...] is limited to a handful of psycholinguistic studies 
documenting effects (or lack thereof) in artificial tasks and it is not clear what, if 
any, implications these findings have for habitual thought”. The special issue by 
Garnham et al. (2016), which addresses effects of semantic gender in German, but 
not in Dutch speaker cognition, and themes such as gender-neutral language and 
its effects on reducing stereotyping and discrimination (Szcesny/Formanowicz/ 
Moser 2016), shows that in this area, many questions concerning both L1 and L2 
speakers remain unanswered as well. In addition, recent public debates on gender- 
neutral language in German and Dutch (Vermeulen 2018) indicate the necessity 
of further research. The present section therefore presents impulses regarding 
content and methodology for future studies. 


4.1 Interpretation of personal nouns in 
L1 and L2 Dutch 


With respect to semantic gender, we find that, in contrast to gendered definite 
articles (e.g., German der/die/das [the]) and inanimate nouns, gender for animate 
(role) nouns (e.g., nurse, technician) has been less frequently researched, espe- 
cially for (Netherlandic) Dutch. In addition, there are few Dutch studies into 
so-called male generics (e.g., Dutch burgemeester [mayor]), which contrasts 
with the substantial German body of research on the topic. For the Dutch public 
to determine what is “gender-neutral” language, we need more studies. We 
need, for example, studies like the one by De Backer/De Cuypere (2012), that 
identify non-/occupational terms and whether they are perceived as fe/male or 
gender-neutral, covering a large number and range of terms (De Backer/De 
Cuypere investigated only sixteen Belgian-Dutch terms). This can be done with 
survey experiments, as De Backer/De Cuypere (2012) did, or with sentence completion 
paradigms, reading tasks and reaction time measurements (e.g., see German 
studies like Horvath et al. 2016). Online measures such as eye tracking also 
provide an excellent means of discovering whether individuals look 
towards fe/male persons upon hearing a given role noun (see Sato/Gygax/Gabriel 
2016). Further, studies that reveal effects of male generics versus gender-neutral 
alternatives on decision-making processes are needed for Dutch. The hiring 
simulation study for German by Horvath/Szcesny (2016), for example, could be 
replicated for Dutch, or the study into effects of language on perception of fe-/ 
male success by Vervecken/Hannover/Wolter (2013) could be replicated with Dutch 
children. Focusing on L2 learners, we need to identify whether and which forms 
are presented to them, and in what manner, in L2 Dutch and German textbooks (see 
Koster/Iding 2019); whether fe-/male noun endings are comprehended as such 
(with/out instruction) and whether target-like comprehension is more easily 
established when L1 categories are similar to L2 categories (e.g., male -er noun 
ending in Dutch vs. German) as compared with different L2 categories (e.g., 
Dutch female noun endings -e, -ing, -ster, -euse, -trice, -a, -es, -esse vs. German 
female ending -in). 


4.2 Acquisition of gender in Dutch vs. German 


The acquisition of Dutch lexical gender seems, when compared to German, rela- 
tively difficult for both L1 speakers and L2 learners. Most studies have, however, 
looked at production data only, and there is evidence suggesting that 
comprehension and production of gender may not always go hand in hand. Ideally, 
future studies should combine both production and comprehension measures, 
but also incorporate more longitudinal designs with emphasis on the age range 
from 4 to 11, in which Dutch gender acquisition appears to move to a target-like 
system. Comprehension studies should ideally resemble real life, thus covering 
larger pieces of text instead of words and single sentences. Simultaneous and 
early successive bilinguals are an understudied group and their processing of 
language has long been thought to (eventually) match that of monolinguals. 
Recent studies investigating brain activity in response to gender violations suggest 
that this is not always the case (Seton 2011) and the L1 of these early bilinguals 
seems to have an impact (Lemmerth/Hopp 2019). Although many studies focusing 
on late L2 learners have already pointed towards the crucial role of the L1 (e.g., 
Sabourin/Stowe 2008), these studies often suffer from high interindividual hetero- 
geneity (e.g., age of acquisition, L2 proficiency, aptitude, etc.). More studies with 
successive, but especially with simultaneous bilinguals are needed as these can 
provide important evidence concerning the impact of one language on another 
language during different stages of acquisition. The role of stereotypical knowledge 
about fe/male roles is another important topic to be explored further, with potential 
for online measures. As ERPs can reveal the activation of stereotypical informa- 
tion (e.g., Misersky 2017), they could be exploited to reveal whether certain non-/ 
stereotypical nouns trigger certain brain patterns in Dutch and German as well. 
Furthermore, such measures might eventually be used to examine potential 
transfer effects of cultural and linguistic stereotypical information in L2 learners 
of Dutch and German. 


4.3 L2 Acquisition of gender using online measures 


Research on L2 acquisition with online processing measures has emerged only 
recently for both languages. It is advisable to examine a wider range of language 
combinations, as the lack of or nature of a gender system in 
the L1 is an important candidate for positive or negative transfer. Direct 
comparisons of Dutch and German, and of the potential lexical activation of L1 
categories during the processing of the L2 at different stages, are needed. An 
interesting candidate for 
this is the interlingual homophone die, combined with gender-neutral personal 
nouns, which functions as a grammatically feminine definite article in German (e.g., 
die Blinde [the blind person]) and as a gender-neutral demonstrative in Dutch (e.g., 
die blinde [that blind person]; see Table 1). If German L2 Dutch learners 
adhere to L1 German grammatical gender when comprehending L2 Dutch die 
blinde, an eye tracking visual world study could reveal whether a female person 
is the preferred candidate to fixate (Koster et al. 2019). 

In addition, it could be further explored whether and to what degree L2 learners 
use their L1 gender system to predict upcoming nouns and what the influence 
is of similarities between the gender systems. Some visual world paradigm studies 
with languages other than Dutch and German have shown that L1 gender categories 
are transferred to anticipate upcoming nouns in the L2 (Weber/Paris 
2004), but similar transfer effects were not found for late Polish L2 Dutch learners 
(Loerts 2012). Whether it is the typological difference related to the absence of 
determiners in Polish or lack of overlap in gender categories that has caused the 
absence of L1 transfer could be studied in more detail in the future. For example, 
future studies could determine whether L2 Dutch learners with L1s that do mark 
gender on articles, but that differ in terms of their similarity to Dutch (e.g., Italian, 
Spanish, French, German) face fewer difficulties with Dutch gender than Polish 
learners as shown by online measures revealing the processing of language as the 
input unfolds. Finally, we need more real-time studies on the processing of mor- 
phosyntactic cues in different types of bilingual children, especially for simulta- 
neous and successive bilinguals. 


5 Conclusions 


The aim of the present paper was to provide an up-to-date literature review on 
linguistic gender in Dutch and German and provide impulses for further 
psycholinguistic work. We have discussed feminine suns and masculine moons; German 
women who are perceived as less fit job candidates because of male “generic” 
language; German- and Dutch-speaking children who face difficulties when they 
must describe objects with gendered articles; and gender similarities and 
differences across German and Dutch, which may benefit, but also confuse, adult foreign 
or second language learners. Gaining a better understanding of the mechanisms 
and processes that underlie these outcomes can help, for example, in the design 
of L2 curricula or governmental or newspaper language policies concerning 
inclusive language. But most of all, present and future psycholinguistic studies 
into gendered language can provide us with insight into our own language 
behaviors and those of people around us. In this way, we can deal with everyone’s 
linguistic and cognitive challenges in a knowledgeable manner. 